<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
	xmlns:media="http://search.yahoo.com/mrss/"
>

<channel>
	<title>Paul Miller - The Cloud of Data &#187; Andy Powell</title>
	<atom:link href="http://cloudofdata.com/tag/andy-powell/feed/" rel="self" type="application/rss+xml" />
	<link>http://cloudofdata.com</link>
	<description>Linked Data, Cloud Computing, Semantic Web, SaaS, PaaS, more</description>
	<lastBuildDate>Thu, 17 May 2012 15:04:40 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
	<copyright>Licensed under the Creative Commons Attribution License, version 3.0 http://creativecommons.org/licenses/by/3.0/</copyright>
	<managingEditor>paul.miller@cloudofdata.com (Paul Miller)</managingEditor>
	<webMaster>paul.miller@cloudofdata.com (Paul Miller)</webMaster>
	<ttl>1440</ttl>
	<image>
		<url>http://cloudofdata.com/logo144x144.jpg</url>
		<title>Paul Miller - The Cloud of Data</title>
		<link>http://cloudofdata.com</link>
		<width>144</width>
		<height>144</height>
	</image>
	<itunes:subtitle>Conversations with the executives shaping Cloud Computing and the Semantic Web.</itunes:subtitle>
	<itunes:summary>Linked Data, Cloud Computing, Semantic Web, SaaS, PaaS, more</itunes:summary>
	<itunes:keywords>Cloud Computing, Semantic Web, Linked Data, Open Data, SaaS, PaaS</itunes:keywords>
	<itunes:category text="Technology" />
	<itunes:category text="Business" />
	<itunes:author>Paul Miller</itunes:author>
	<itunes:owner>
		<itunes:name>Paul Miller</itunes:name>
		<itunes:email>paul.miller@cloudofdata.com</itunes:email>
	</itunes:owner>
	<itunes:block>no</itunes:block>
	<itunes:explicit>no</itunes:explicit>
	<itunes:image href="http://cloudofdata.com/logo300x300.jpg" />
		<item>
		<title>In a world of niche Clouds, how do you define a useful niche?</title>
		<link>http://cloudofdata.com/2010/12/in-a-world-of-niche-clouds-how-do-you-define-a-useful-niche/</link>
		<comments>http://cloudofdata.com/2010/12/in-a-world-of-niche-clouds-how-do-you-define-a-useful-niche/#comments</comments>
		<pubDate>Tue, 14 Dec 2010 13:08:20 +0000</pubDate>
		<dc:creator>Paul Miller</dc:creator>
				<category><![CDATA[Cloud computing]]></category>
		<category><![CDATA[Enterprise Computing]]></category>
		<category><![CDATA[IaaS]]></category>
		<category><![CDATA[Amazon Web Services]]></category>
		<category><![CDATA[Andy Powell]]></category>
		<category><![CDATA[Data center]]></category>
		<category><![CDATA[Eduserv]]></category>
		<category><![CDATA[FleSSR]]></category>
		<category><![CDATA[JISC]]></category>
		<category><![CDATA[Joint Information Systems Committee]]></category>
		<category><![CDATA[Rackspace]]></category>
		<category><![CDATA[VMware]]></category>

		<guid isPermaLink="false">http://cloudofdata.com/?p=1393</guid>
		<description><![CDATA[There are a couple of interesting posts on the blog of the UK&#8217;s FleSSR project, detailing their efforts to work out how feasible it might be to offer a new Cloud service to universities. More on that in a moment. I don&#8217;t think I&#8217;ve ever really been convinced by the argument that everything will end [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://geekandpoke.typepad.com/geekandpoke/2008/05/simply-explaine.html" target="_blank"><img class="alignright size-medium wp-image-1396" style="margin: 0px; border: 0px initial initial;" title="Simply Explained - Cloud Computing" src="http://cloudofdata.com/wp-content/uploads/2010/12/cloud-explained-300x214.jpg" alt="" width="300" height="214" /></a>There are a couple of interesting posts on the blog of the UK&#8217;s FleSSR project, detailing their efforts to work out how feasible it might be to offer a new Cloud service to universities. More on that in a moment.</p>
<p>I don&#8217;t think I&#8217;ve ever really been convinced by the argument that <em>everything</em> will end up in the data centres of <a class="zem_slink" title="Amazon EC2" rel="homepage" href="http://aws.amazon.com/ec2/">Amazon</a>.</p>
<p>The straightforward provision of commodity Cloud Computing is an important &#8211; and growing &#8211; area, and one that will continue to expand as interfaces become simpler, FUD is challenged, and prices maintain their relentless march towards the bottom. <em>Everyone</em> has <em>something</em> they could usefully, sensibly, and cost-effectively run in a commodity Cloud such as those offered by <a href="http://aws.amazon.com/">Amazon</a>, <a class="zem_slink" title="Rackspace" rel="homepage" href="http://www.rackspace.com">Rackspace</a>, <a href="http://www.flexiant.com/">Flexiant</a>, and others. In <em>this</em> space, basic stability, security and reliability combine with a compelling &#8211; and diminishing &#8211; pricing proposition to create commodity services targeted squarely at lowest common denominator functionality. Here, market forces may (inevitably?) lead to an eventual reduction in the number of providers. Cost, although not the only consideration, is both important and compelling. Although markets like competition, there may even be a single winner here, one day.</p>
<p>Layered all around the basic, routine, grunt-work computation that these commodity public clouds handle so well, many organisations find themselves having to cope with a wide range of <em>other</em> use cases and data sets. Some require specialist hardware (like the <a class="zem_slink" title="Graphics processing unit" rel="wikipedia" href="http://en.wikipedia.org/wiki/Graphics_processing_unit">GPUs</a> that Amazon has <a href="http://aws.typepad.com/aws/2010/11/new-ec2-instance-type-the-cluster-gpu-instance.html">recently begun selling access to</a>). Some demand that particular regulatory and legislative hoops be jumped through. Some have quirky requirements around latency in data transfer or speed of in-CPU processing. Some have <em>lots</em> of data, and issues with regard to getting the stuff from one location to another with a sensible balance between transfer cost and time.</p>
<p>All of these are certainly capable of being addressed in the Cloud, but the economics and the business rationale begin to shift. For the data owner, cost may no longer be quite so significant a factor. Reliability may matter more, or speed, or the audit trail. For the Cloud provider, these requirements no longer look like the lowest common denominator. It&#8217;s not cost-effective to provide these capabilities to <em>everyone</em> and still keep the price low. It becomes more sensible to segment, to divide, and to create bespoke offerings of various kinds. Some of these services require such specific things in terms of network topology, physical building layout, and staff expertise that it may even become counter-productive to have these services in the same building as the commodity Cloud. Here, there&#8217;s plenty of room for new entrants, plenty of scope for competition, and plenty of opportunity to differentiate in terms of price, location, support, and a host of other factors. This segment of the Cloud is only just getting started.</p>
<p>In these contexts, we see compelling arguments made for on-premise private clouds, off-premise private clouds, hybrid clouds, community clouds and the rest. Some of the arguments made in favour of private and hybrid are certainly part of the FUD we see in this space, but beneath the noise, the security scares, and the vested interests of SysAdmins and sellers of data centre components, there lies a grain of truth. Not everything is most sensibly run on a cheap VM, rented from Amazon (or Rackspace, or whoever) with your credit card, and physically located halfway round the planet.</p>
<p>Unfortunately, it can be difficult to make sensible decisions about which type of cloud works best in each situation, and large swathes of the market are doing everything in their power to add to the confusion.</p>
<p>Having accepted that the basic offering from a public cloud provider is not the solution for my particular requirements, where do I turn next?</p>
<p>Do I listen to the (convincing) pitch from a vendor of &#8216;community cloud&#8217; solutions for my domain? If I&#8217;m in Healthcare, they come with HIPAA and European Data Protection Directive compliance, and all sorts of other accreditations. For dealing with sensitive patient data, this may be just what I need&#8230; but does the wily salesman <em>also</em> persuade me to run staff email and the hospital volleyball club website on this over-specified (and expensive) infrastructure?</p>
<p>Do I listen to the (convincing) pitch from a vendor of virtualisation software? If I&#8217;ve got a reasonably sized data centre with some life left in it, I may see the value of virtualising all of that expensive hardware, and running current applications in house more efficiently. But instead of gradually reducing my in-house costs, do I continue to add more machines as current ones reach end of life, or as new requirements come along?</p>
<p>Do I listen to the (convincing) pitch from my co-location facility, which happily sells me a &#8216;private cloud&#8217; that may fail to deliver some of the economies of scale so central to the main Cloud proposition?</p>
<p>Do I listen to the horror stories, stick my head in the sand, and simply keep ordering servers until every single one of my competitors undercuts my costs and I go out of business?</p>
<p>These, and more, are certainly possible. But let&#8217;s return to that UK project I mentioned right at the start.</p>
<p>Flexible Services for the Support of Research (<a href="http://flessr.blogspot.com/">FleSSR</a>) is</p>
<blockquote><p>&#8220;a new cloud pilot project looking at utilising hybrid private-public IaaS cloud infrastructure to provide computational and data services to the academic research community. The project is a collaboration between the Oxford e-Research Center, IT Service @ University or Reading, e-Science Centre @ STFC, Eduserv, EoverI, Eucalyptus INC and Canonical Ltd.&#8221;</p></blockquote>
<p>The ten-month project is funded by the Joint Information Systems Committee (<a href="http://www.jisc.ac.uk">JISC</a>), an organisation that supports the innovative use of IT across UK universities.</p>
<p>Now, to a degree, the project&#8217;s mindset must be influenced by its partners. IT staff at Reading and STFC are incumbents with turf to protect (or new vistas to discover, map, and claim). Eduserv has a new data centre that they&#8217;d like to fill with willing clients. It would be easy to be cynical, but knowing some of the people involved, I see no real reason to be. It is perfectly reasonable to suggest that a &#8216;community&#8217; the size of UK Higher Education would realise value in replicating less (not nothing) at every university campus across the country, and bringing much of that together in some sort of Cloud. That Cloud might use public infrastructure, or it might be served up from an organisation such as Eduserv, which is known to the community, aware of the community&#8217;s requirements, quirks and foibles, and (importantly) not-for-profit (and therefore cheaper?).</p>
<p>Personally, I&#8217;d always rather presumed that an organisation like Eduserv (or JISC itself) would act on behalf of the community to procure a competitive price on access to the resources of Amazon, Rackspace, or one of the others. I&#8217;m not convinced that <em>most</em> UK research computation needs any sort of special treatment that couldn&#8217;t be met from Amazon&#8217;s Dublin data centre&#8230; unless the community itself can somehow beat &#8211; and continue to beat &#8211; Amazon on price. Somewhat surprisingly, that&#8217;s exactly what some calculations in <a href="http://flessr.blogspot.com/2010/12/costs-of-storage-in-cloud.html">two</a> <a href="http://flessr.blogspot.com/2010/12/costs-of-building-storage-for-cloud.html">posts</a> by Eduserv&#8217;s Andy Powell suggest could happen. By including network costs and other charges over and above the basic storage cost, Andy finds Amazon, Rackspace and Dropbox to be more expensive than anticipated, and posits that Eduserv (connected to every UK university free of charge via JISC&#8217;s high-speed <a href="http://www.ja.net/">JANET</a> service, and constrained in the ways it can generate profit from services sold to universities by its charitable status) might actually work out cheaper.</p>
<p>There&#8217;s a lot of work to do in terms of fleshing out the assumptions behind some of Andy&#8217;s figures, but the whole industry certainly benefits when people conduct exercises like these out in the open, for all to see. If Andy has made mistakes, the vendors should be quick to jump in and correct them. If his assumptions miss the mark, public debate can redress the balance.</p>
<p>The Cloud is not all about price. But more transparency around the true cost of computing in the Cloud &#8211; and in your data centre &#8211; means that we can all make more informed decisions.</p>
<p>Thanks for sharing, Andy &#8211; and hopefully readers will be willing and able to look over your calculations and share their own views.</p>
<p><strong>Note</strong>: <em>this post was conceived and written in the United Kingdom. By reading this post you agree to comply with UK usage, and will henceforth pronounce the word &#8216;niche&#8217; from the title as &#8216;neesh,&#8217; not &#8216;nitch.&#8217; Or maybe not.</em></p>
<h6 class="zemanta-related-title" style="font-size: 1em;">Related articles</h6>
<ul class="zemanta-article-ul">
<li class="zemanta-article-ul-li"><a href="http://www.zdnet.com/blog/btl/rackspace-launches-managed-cloud-services/42436">Rackspace launches managed cloud services</a> (zdnet.com)</li>
<li class="zemanta-article-ul-li"><a href="http://venturebeat.com/2010/12/06/cloud-computing-public-private-hybrid-demistified/">Are hybrid clouds the path to cloud-computing nirvana?</a> (venturebeat.com)</li>
<li class="zemanta-article-ul-li"><a href="http://www.rackspacecloud.com/blog/2010/12/14/test/" class="broken_link">We&#8217;ll Take Care of Your Cloud, While You Manage Your Business</a> (rackspacecloud.com)</li>
<li class="zemanta-article-ul-li"><a href="http://www.cloudave.com/8675/trust-is-key-for-cloud-success-and-what-can-we-do-about-it/">Trust Is Key For Cloud Success And What Can We Do About It?</a> (cloudave.com)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://cloudofdata.com/2010/12/in-a-world-of-niche-clouds-how-do-you-define-a-useful-niche/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Repositories in the Cloud? Why on earth not?!</title>
		<link>http://cloudofdata.com/2010/02/repositories-in-the-cloud-why-on-earth-not/</link>
		<comments>http://cloudofdata.com/2010/02/repositories-in-the-cloud-why-on-earth-not/#comments</comments>
		<pubDate>Sun, 21 Feb 2010 18:05:42 +0000</pubDate>
		<dc:creator>Paul Miller</dc:creator>
				<category><![CDATA[Cloud computing]]></category>
		<category><![CDATA[Open Data]]></category>
		<category><![CDATA[Academic publishing]]></category>
		<category><![CDATA[Amazon Web Services]]></category>
		<category><![CDATA[Andy Powell]]></category>
		<category><![CDATA[Archives]]></category>
		<category><![CDATA[AWS]]></category>
		<category><![CDATA[Colleges and Universities]]></category>
		<category><![CDATA[Eduserv]]></category>
		<category><![CDATA[Higher Education]]></category>
		<category><![CDATA[infochimps]]></category>
		<category><![CDATA[Institutional repository]]></category>
		<category><![CDATA[JISC]]></category>
		<category><![CDATA[Open access]]></category>
		<category><![CDATA[Panton Principles]]></category>
		<category><![CDATA[repcloud]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Software as a service]]></category>

		<guid isPermaLink="false">http://cloudofdata.com/?p=932</guid>
		<description><![CDATA[To be honest, I&#8217;ve never fully understood Higher Education&#8217;s penchant for building &#8216;institutional repositories.&#8217; These frequently under-populated aggregations of academic papers produced by &#8216;research active&#8217; employees of a particular university appear aligned almost exclusively to vaguely expressed institutional imperatives, and seem largely unrelated to either the selfish aspirations of the contributing authors or the tangible [...]]]></description>
			<content:encoded><![CDATA[<p>To be honest, I&#8217;ve never fully understood Higher Education&#8217;s penchant for building &#8216;<a class="zem_slink freebase/en/institutional_repository" title="Institutional repository" rel="wikipedia" href="http://en.wikipedia.org/wiki/Institutional_repository">institutional repositories</a>.&#8217; These frequently under-populated aggregations of academic papers produced by &#8216;research active&#8217; employees of a particular university appear aligned almost exclusively to vaguely expressed institutional imperatives, and seem largely unrelated to either the selfish aspirations of the contributing authors or the tangible relationships they painstakingly construct with others across their chosen discipline. The &#8216;repository&#8217; all too often appears a bureaucratic solution to a problem that the supposed beneficiaries do not recognise; a technological aberration that sits outside the conversational flow of the Web to which it is only tenuously attached.</p>
<p>Furthermore, &#8216;<a class="zem_slink freebase/en/open_access" title="Open access (publishing)" rel="wikipedia" href="http://en.wikipedia.org/wiki/Open_access_%28publishing%29">Open Access</a>&#8217; and &#8216;Repository&#8217; typically go hand in hand. If you support Open Access you need a repository, and if you question the role of repositories you&#8217;re in the pocket of evil publishers who want to lock up everything ever written and lease reading rights back to the employers of those who wrote the stuff in the first place.</p>
<p>Nonsense.</p>
<p>Open Access is an important component of today&#8217;s scholarly ecosystem. It&#8217;s not the only answer, and it&#8217;s not perfect, but it <em>does</em> have a significant part to play. Institutions have a role in preserving, disseminating and exploiting the work of their employees, but these are very different tasks that may benefit from different solutions. In too many cases, the repository is by default seen as a preservation mechanism <em>and</em> a dissemination vehicle, and as such it may fail to cost-effectively achieve either aim.</p>
<p>There are some large, well known, and research-intensive institutions where it might be possible to make a compelling argument for projecting a strong institutional image around a single &#8216;home&#8217; for all of that research output. Never mind, for a moment, that so much research today is the result of inter-institutional collaboration, or that the eminent researcher might wish to take &#8216;their&#8217; research publications with them as they move from Oxford to Harvard to York during their glittering career.</p>
<p>Alongside those institutions sit a plethora of others where research of equal quality is also being conducted; there just, maybe, isn&#8217;t quite as much of it. Bombarded by &#8216;advice&#8217; and funding, and desperate to keep up with the <a class="zem_slink freebase/en/russell_group" title="Russell Group" rel="wikipedia" href="http://en.wikipedia.org/wiki/Russell_Group">Russell Group</a>, ever more institutions blindly join the repository cult and wonder why their new toys do not fill to overflowing with the jewels of scholarly erudition.</p>
<p>As research becomes increasingly data-rich, the whole cycle looks set to repeat. The recently released <a href="http://pantonprinciples.org/">Panton Principles</a> for <a class="zem_slink freebase/en/open_data" title="Open Data" rel="wikipedia" href="http://en.wikipedia.org/wiki/Open_Data">Open Data</a> in Science are to be welcomed, but I&#8217;ll bet the institutional response will all too often be the commissioning of a &#8216;data repository&#8217; to sit alongside the &#8216;publication repository&#8217; they already don&#8217;t use.</p>
<p>All of which is a rather long-winded way of introducing the fact that Eduserv&#8217;s <a class="zem_slink" title="Andy Powell" rel="twitter" href="http://twitter.com/andypowe11">Andy Powell</a> has asked me to facilitate a breakout afternoon on &#8216;Policy Issues&#8217; at the <a href="http://www.eduserv.org.uk/events/repcloud" class="broken_link">Repositories in the Cloud</a> event <a href="http://www.eduserv.org.uk/research">Eduserv</a> and <a class="zem_slink freebase/en/joint_information_systems_committee" title="Joint Information Systems Committee" rel="wikipedia" href="http://en.wikipedia.org/wiki/Joint_Information_Systems_Committee">JISC</a> are holding in London on Tuesday.</p>
<blockquote><p>&#8220;This free event, organised jointly by Eduserv and the JISC, will bring together software developers, repository managers, service providers, funding and advisory bodies to discuss the potential policy and technical issues associated with <strong>cloud computing</strong> and the delivery of <strong>repository services</strong> in UK HEIs.&#8221;</p></blockquote>
<p>In a post on 11 February, <a href="http://efoundations.typepad.com/efoundations/2010/02/repositories-and-the-cloud-tell-us-your-views.html">Andy invited participants to share some of their views</a> ahead of the meeting, and on 19 February <a href="http://efoundations.typepad.com/efoundations/2010/02/in-the-clouds.html">he wrote about some of his own thoughts</a>.</p>
<p>Like Andy, I struggled somewhat to nail down a coherent set of thoughts about the issue of pushing today&#8217;s repositories into the Cloud. On one level, I wonder whether the vast majority of institutions with small (and relatively low traffic) repositories would see much of a tangible efficiency gain or cost saving by moving off an in-house computer to rent an equivalent <a class="zem_slink freebase/en/virtual_machine" title="Virtual machine" rel="wikipedia" href="http://en.wikipedia.org/wiki/Virtual_machine">Virtual Machine</a> from Amazon, Rackspace, or any of their competitors. If we&#8217;re talking about IT systems within a typical university, there are others (email, calendaring, pools of compute resource for research jobs, etc) that appear more immediately compelling for the shift Cloud-ward. Which is not to say that there isn&#8217;t a clear opportunity for someone trusted to step into this space and offer a <a class="zem_slink freebase/en/software_as_a_service" title="Software as a service" rel="wikipedia" href="http://en.wikipedia.org/wiki/Software_as_a_service">SaaS</a> repository to which institutions might affordably subscribe. Eduserv? Mimas? Edina? The British Library? The National Archives? Duraspace? Any could, and if we&#8217;re not ready for something more then at least one probably should.</p>
<p>However, a bolder reconsideration of what repositories <em>are</em> and what they&#8217;re <em>for</em> might very well lead to something interesting, sustainable, and perfectly suited for benefitting from Cloud Computing&#8217;s strengths.</p>
<p>Why does a paper have to be &#8216;deposited&#8217; in a repository? Why does a single paper with three authors from three institutions have to be deposited in three separate institutional repositories? Why does that same paper have to be deposited – separately – in the subject repository favoured by scholars in the relevant discipline? Why does the institution&#8217;s very reasonable desire to protect, preserve, promote and disseminate its excellence mean that it has to run systems in perpetuity that preserve and permit access? Why do we address the fundamentally different (perhaps even contradictory) problems of access and preservation in the same system? Why can&#8217;t the individual researcher easily assemble a view across their publication history, regardless of the institution within which they happened to reside as they wrote each paper? Why don&#8217;t the assemblages of papers reflect personal, professional and disciplinary relationships, alongside (or instead of) the contractual accident of employee-employer relationships? Why isn&#8217;t the wealth of metadata implicit to any publication (authors, subjects, dates, citations, and more) available and actionable, both inside the repository and far beyond it across the Web? Why isn&#8217;t there a tight and active association between the paper and the data from which its findings were derived (something for which <em><a href="http://intarch.ac.uk/">Internet Archaeology</a></em> was demonstrating utility a very long time ago)?</p>
<p>Scholarly papers principally comprise text, augmented by the occasional static image. They&#8217;re not big, and they don&#8217;t tend to change very fast. In many ways, they represent a fairly easy problem set with which to work. As more and more data becomes key to research in a growing number of subject areas, the problems are set to become far larger and far more difficult. For individual universities to even consider replicating the process by which they all ended up with their repositories of text surely seems madness in this data-rich environment. Even with levels of uptake as low as those seen in too many text repositories, the issues of data management, curation, access and dissemination are too great to be sensibly solved in the institutional machine room. Services like <a href="http://infochimps.org/">InfoChimps</a> and Amazon&#8217;s own <a href="http://aws.amazon.com/publicdatasets/">Public Data Sets</a> offering show some of the ways that we might begin to work with data at scale. Might we, for example, come to recognise, as Amazon has, that it&#8217;s actually cheaper and quicker to entrust large data sets to FedEx rather than transmit them over the Internet?</p>
<p>&#8216;The answer&#8217; might be some central service for the community, funded by JISC like the Arts &amp; Humanities Data Service (AHDS) of old. Or it might be something different, something nimbler, more responsive, more flexible to individual, institutional, and disciplinary requirements, and something more scalable to new disciplines; institutional support for and use of <em>existing</em> Cloud infrastructures extending far beyond UK Higher Education, aligned with a clear understanding of the separation between preservation and access.</p>
<p>I certainly don&#8217;t have all the answers, but I do believe that simply asking whether or not we should move existing repositories to the Cloud is to miss the point. Rather, we should ask what role the Cloud might play in addressing the business requirements to which the institutional repository was our initial – faltering – response. The answer might very well be &#8216;None,&#8217; but I doubt it.</p>
<p>I look forward to Tuesday&#8217;s discussion. I&#8217;m not going there to push my personal view that individual institutions frequently shouldn&#8217;t be building, running or populating their own repositories at all. I&#8217;m going there to facilitate the discussion those in the room want to have, and to learn from their experiences and their perspectives.</p>
<h6 class="zemanta-related-title" style="font-size: 1em;">Related articles by Zemanta</h6>
<ul class="zemanta-article-ul">
<li class="zemanta-article-ul-li"><a href="http://scholarlykitchen.sspnet.org/2010/01/07/citation-advantage-for-mandated-open-access-articles/">Does a Citation Advantage Exist for Mandated Open Access Articles?</a> (scholarlykitchen.sspnet.org)</li>
<li class="zemanta-article-ul-li"><a href="http://hangingtogether.org/?p=770">Scholarly content and the cliff edge: the place of subject &#8216;repositories&#8217;</a> (hangingtogether.org)</li>
<li class="zemanta-article-ul-li"><a href="http://www.downes.ca/cgi-bin/page.cgi?post=51742">Scholarly Communications must be Scalable</a> (downes.ca)</li>
<li class="zemanta-article-ul-li"><a href="http://opendotdotdot.blogspot.com/2010/02/beyond-open-access-open-publishing.html">Beyond Open Access: Open Publishing</a> (opendotdotdot.blogspot.com)</li>
<li class="zemanta-article-ul-li"><a href="http://www.scienceblog.com/cms/57-college-presidents-declare-support-public-access-publicly-funded-research-us-25470.html" class="broken_link">57 college presidents declare support for public access to publicly funded research in the US</a> (scienceblog.com)</li>
<li class="zemanta-article-ul-li"><a href="http://r.zemanta.com/?u=http%3A//www.guardian.co.uk/education/2010/feb/11/academics-in-aspic-says-mandelson&amp;a=12898526&amp;rid=f65ff066-66fd-42d9-bc76-113bd6066317&amp;e=5236f562a8baffa164e8623f52cd7d44">Mandelson says academics are &#8216;set in aspic&#8217;</a> (guardian.co.uk)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://cloudofdata.com/2010/02/repositories-in-the-cloud-why-on-earth-not/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Does Linked Data need RDF?</title>
		<link>http://cloudofdata.com/2009/07/does-linked-data-need-rdf/</link>
		<comments>http://cloudofdata.com/2009/07/does-linked-data-need-rdf/#comments</comments>
		<pubDate>Sun, 19 Jul 2009 11:14:45 +0000</pubDate>
		<dc:creator>Paul Miller</dc:creator>
				<category><![CDATA[Linked Data]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Andy Powell]]></category>
		<category><![CDATA[Data Web]]></category>
		<category><![CDATA[Ian Davis]]></category>
		<category><![CDATA[Leigh Dodds]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[Resource Description Framework]]></category>
		<category><![CDATA[Tim Berners-Lee]]></category>
		<category><![CDATA[Twitter]]></category>
		<category><![CDATA[Uniform Resource Identifier]]></category>
		<category><![CDATA[Web 3.0]]></category>
		<category><![CDATA[Web of Data]]></category>
		<category><![CDATA[World Wide Web]]></category>
		<category><![CDATA[World Wide Web Consortium]]></category>

		<guid isPermaLink="false">http://cloudofdata.com/?p=721</guid>
		<description><![CDATA[Before going any further, let&#8217;s get a few things crystal clear: The recent success of the Linked Data meme is long overdue, very welcome, and entirely capable of carrying the Web of Data far beyond its current niche adherents. A lot of my current work involves arguing that more organisations [...]]]></description>
			<content:encoded><![CDATA[<div class="zemanta-img" style="margin: 1em; display: block;">
<div>
<dl class="wp-caption alignright" style="width: 250px;">
<dt class="wp-caption-dt"><a href="http://www.flickr.com/photos/67968452@N00/3272712288"><img title="PhotonQ-Tim Berners Lee on Linked Data at TED" src="http://farm4.static.flickr.com/3449/3272712288_2ef843a4b7_m.jpg" alt="PhotonQ-Tim Berners Lee on Linked Data at TED" width="240" height="180" /></a></dt>
<dd class="wp-caption-dd zemanta-img-attribution" style="font-size: 0.8em;">Image by <a href="http://www.flickr.com/photos/67968452@N00/3272712288">PhOtOnQuAnTiQuE</a> via Flickr</dd>
</dl>
</div>
</div>
<p>Before going any further, let&#8217;s get a few things <em>crystal</em> clear:</p>
<ol>
<li>The recent success of the <a class="zem_slink" title="Linked Data" rel="wikipedia" href="http://en.wikipedia.org/wiki/Linked_Data">Linked Data</a> meme is long overdue, very welcome, and entirely capable of carrying the Web of Data far beyond its current niche adherents. A lot of my current work involves arguing that more organisations should adopt this approach;</li>
<li>The <a class="zem_slink" title="Resource Description Framework" rel="wikipedia" href="http://en.wikipedia.org/wiki/Resource_Description_Framework">Resource Description Framework</a>, RDF, is a key, and powerful, piece in <a class="zem_slink" title="World Wide Web Consortium" rel="homepage" href="http://www.w3.org/">W3C</a>&#8217;s <a class="zem_slink" title="Semantic Web" rel="wikipedia" href="http://en.wikipedia.org/wiki/Semantic_Web">Semantic Web</a> Architecture. Since its <a href="http://www.ukoln.ac.uk/metadata/resources/dc/datamodel/WD-dc-rdf/">earliest days</a>, I have played various parts in advocating the potential of RDF and will continue to do so;</li>
<li>RDF is an obvious means of publishing — and consuming — Linked Data powerfully, flexibly, and interoperably. I will continue to argue this, and to advocate its wider adoption.</li>
</ol>
<p>So far, so good.</p>
<p>The problem, I contend, comes when well-meaning and knowledgeable advocates of both Linked Data and RDF conflate the two and infer, imply or assert that &#8216;Linked Data&#8217; can only be Linked Data if expressed in RDF.</p>
<p>This dogmatism makes me deeply uncomfortable, and I find myself unable to agree with the underlying premise.</p>
<p>The rest of this post attempts to explain why, hopefully more lucidly than I or those with whom I was debating managed on Friday evening via the largely unsuitable medium of the 140-character tweet.</p>
<p>Andy Powell started things off clearly enough on Friday, <a href="http://twitter.com/andypowe11/statuses/2687499113">asking</a>:</p>
<blockquote><p>&#8220;is there an agreed name for an approach that adopts the 4 principles of #linkeddata minus the phrase, &#8216;using the standards (RDF, SPARQL)&#8217; ??&#8221;</p></blockquote>
<p>I was amongst those to respond, <a href="http://twitter.com/PaulMiller/statuses/2687580097">suggesting</a>, as I usually do, that:</p>
<blockquote><p>&#8220;well, personally, I&#8217;d argue that Linked Data does NOT require that phrase. But I know others disagree&#8230;  <img src='http://cloudofdata.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> &#8221;</p></blockquote>
<p>Other pieces of that conversation can be extracted from <a href="http://search.twitter.com/search?q=&amp;ands=&amp;phrase=&amp;ors=&amp;nots=&amp;tag=linkeddata&amp;lang=all&amp;from=&amp;to=&amp;ref=&amp;near=&amp;within=15&amp;units=mi&amp;since=2009-07-17&amp;until=2009-07-17&amp;rpp=50">the stream</a>; start by scrolling to the bottom, find Andy&#8217;s tweet, and work back toward the top.</p>
<p>It&#8217;s worth noting that two of those arguing most vehemently against me were former colleagues <a href="http://iandavis.com/blog/">Ian Davis</a> and <a href="http://www.ldodds.com/">Leigh Dodds</a>. I have massive respect for the technical prowess of both (which is certainly greater than my own), and have learned a great deal from Ian in particular over the years that we have known one another. <em>This</em> issue, though, is one on which we have long disagreed, and it was interesting to see the subject of many a difference of opinion in the bars of various conference hotels spill into this public arena.</p>
<p>Anyway, now let me try to explain what I meant.</p>
<p>Perhaps the most commonly cited definition for Linked Data is the one to which Andy was referring: <a class="zem_slink" title="Tim Berners-Lee" rel="wikipedia" href="http://en.wikipedia.org/wiki/Tim_Berners-Lee">Sir Tim Berners-Lee</a>&#8217;s <a href="http://www.w3.org/DesignIssues/LinkedData.html"><em>Linked Data &#8211; Design Issues</em></a> document. It&#8217;s worth noting that this document is clearly flagged (in the current version amended on 18 June 2009, at least) as being both a &#8216;personal view only&#8217; and &#8216;imperfect but published.&#8217; So it is a very long way from being a &#8216;standard,&#8217; &#8216;specification,&#8217; or &#8216;definition,&#8217; but it is certainly still a pretty good starting point, and one to which I often direct clients and others.</p>
<p>Berners-Lee begins,</p>
<blockquote><p>&#8220;The Semantic Web isn&#8217;t just about putting data on the web. It is about making links, so that a person or machine can explore the web of data.  <strong>With linked data, when you have some of it, you can find other, related, data.</strong>&#8221;</p>
<p>(my emphasis)</p></blockquote>
<p>That sounds good, doesn&#8217;t it? Indeed, we talked about that on the Linked Data panel I moderated at the recent <a href="http://semanticconference.com/">Semantic Technology Conference</a>, and I&#8217;ve embedded the <a href="http://vimeo.com/channels/semtech2009#5492097">video</a> here.</p>
<p style="text-align: center;"><iframe src="http://player.vimeo.com/video/5492097" width="400" height="225" frameborder="0"></iframe></p>
<p>It is the next section of Berners-Lee&#8217;s document that is used to validate the view that Linked Data needs RDF:</p>
<blockquote><p>&#8220;1. Use URIs as names for things</p>
<p>2. Use HTTP URIs so that people can look up those names</p>
<p>3. When someone looks up a URI, provide useful information, <strong>using the standards (RDF, SPARQL)</strong></p>
<p>4. Include links to other URIs. so that they can discover more things.&#8221;</p>
<p>(my emphasis)</p></blockquote>
<p>On one reading, this is an unambiguous validation of the view with which I disagree. On another, it is a <em>suggestion</em> of best practice, expressed as part of a &#8216;<em>personal</em> view&#8217; with which we are perfectly entitled to take issue.</p>
<p>Would the zealots be calmed by the simple insertion of &#8216;preferably&#8217; or &#8216;ideally,&#8217; immediately after point three&#8217;s second comma? Maybe. Or perhaps the fires of Linked Data&#8217;s self-appointed Inquisition would be stoked for Berners-Lee himself.</p>
<p>Talk of Linked Data, Open Data, the Web of Data and related concepts in recent years has led to a quite remarkable shift in attitude amongst individuals, public bodies and private corporations. Almost everywhere my work takes me, clever people are seriously grappling with the implications of <em>consuming</em> from or <em>contributing</em> to these emerging ecosystems. Not all of their questions have good answers, and not all of the technological, strategic and business implications have necessarily been fully worked through. But these people are <em>asking</em> the questions, and they are asking them in all seriousness. That is a dramatic and welcome shift.</p>
<p>Some, such as the BBC, Thomson Reuters and the UK Government&#8217;s Central Office of Information, are sufficiently persuaded of the benefits to take risks and to take a lead in opening what was previously closed. Others will follow, as fears are assuaged, doubts eased, and benefits realised.</p>
<p>Despite this undoubted progress, the green shoots of a Linked Data ecology remain delicate. We do the broader aims great harm when we move from a message that stresses the value of unambiguous and web-addressable naming (HTTP URIs), providing &#8216;useful information,&#8217; and enabling people to &#8216;discover more things&#8217; by linking, toward a message that elevates one of the <em>best</em> mechanisms (RDF) for achieving this to become the <em>only</em> permissible approach.</p>
<p>Yes, those already in the club will probably be very pleased with the purity and functionality of the toys in their playground. But they will have barred a far larger group with data to share, a willingness to learn, and an enthusiasm to engage. At best, they will have slowed the growth of the pool of Linked Data quite dramatically. At worst, they will have created an increasingly irrelevant backwater that more pragmatic people will simply route around. Perhaps, in their pragmatism, those people will now <em>never</em> look seriously at RDF and its power, scared away by the fervour of those who sought to elevate it too high, and too fast.</p>
<p>What are we after? More Linked Data, or more RDF? I sincerely hope it&#8217;s the former.</p>
<p>So let&#8217;s see loads more Linked Data, and plenty of evangelism as to why RDF could be the <em>best</em> way to do it. But let&#8217;s not ostracise the vast majority of potential participants, contributors and beneficiaries in the world of Linked Data, just because they haven&#8217;t wholeheartedly embraced RDF yet.</p>
]]></content:encoded>
			<wfw:commentRss>http://cloudofdata.com/2009/07/does-linked-data-need-rdf/feed/</wfw:commentRss>
		<slash:comments>45</slash:comments>
		</item>
	</channel>
</rss>

