<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
		xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
	xmlns:media="http://search.yahoo.com/mrss/"
>

<channel>
	<title>Paul Miller - The Cloud of Data &#187; Glue</title>
	<atom:link href="http://cloudofdata.com/tag/glue/feed/" rel="self" type="application/rss+xml" />
	<link>http://cloudofdata.com</link>
	<description>Linked Data, Cloud Computing, Semantic Web, SaaS, PaaS, more</description>
	<lastBuildDate>Fri, 10 Feb 2012 10:46:51 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<copyright>Licensed under the Creative Commons Attribution License, version 3.0 http://creativecommons.org/licenses/by/3.0/</copyright>
	<managingEditor>paul.miller@cloudofdata.com (Paul Miller)</managingEditor>
	<webMaster>paul.miller@cloudofdata.com (Paul Miller)</webMaster>
	<ttl>1440</ttl>
	<image>
		<url>http://cloudofdata.com/logo144x144.jpg</url>
		<title>Paul Miller - The Cloud of Data</title>
		<link>http://cloudofdata.com</link>
		<width>144</width>
		<height>144</height>
	</image>
	<itunes:subtitle>Conversations with the executives shaping Cloud Computing and the Semantic Web.</itunes:subtitle>
	<itunes:summary>Linked Data, Cloud Computing, Semantic Web, SaaS, PaaS, more</itunes:summary>
	<itunes:keywords>Cloud Computing, Semantic Web, Linked Data, Open Data, SaaS, PaaS</itunes:keywords>
	<itunes:category text="Technology" />
	<itunes:category text="Business" />
	<itunes:author>Paul Miller</itunes:author>
	<itunes:owner>
		<itunes:name>Paul Miller</itunes:name>
		<itunes:email>paul.miller@cloudofdata.com</itunes:email>
	</itunes:owner>
	<itunes:block>no</itunes:block>
	<itunes:explicit>no</itunes:explicit>
	<itunes:image href="http://cloudofdata.com/logo300x300.jpg" />
		<item>
		<title>Is there a disconnect between Big Data and the Web of Data?</title>
		<link>http://cloudofdata.com/2010/11/is-there-a-disconnect-between-big-data-and-the-web-of-data/</link>
		<comments>http://cloudofdata.com/2010/11/is-there-a-disconnect-between-big-data-and-the-web-of-data/#comments</comments>
		<pubDate>Tue, 30 Nov 2010 16:36:08 +0000</pubDate>
		<dc:creator>Paul Miller</dc:creator>
				<category><![CDATA[Cloud computing]]></category>
		<category><![CDATA[Enterprise Computing]]></category>
		<category><![CDATA[Linked Data]]></category>
		<category><![CDATA[Open Data]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Web 3.0]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[BigData]]></category>
		<category><![CDATA[Defrag]]></category>
		<category><![CDATA[Glue]]></category>
		<category><![CDATA[LinkedData]]></category>
		<category><![CDATA[OpenData]]></category>
		<category><![CDATA[strataconf]]></category>
		<category><![CDATA[structureconf]]></category>

		<guid isPermaLink="false">http://cloudofdata.com/?p=1308</guid>
		<description><![CDATA[&#8216;Big Data&#8217; is currently capturing the imagination, attracting hype, investment and ambitious startups in almost equal measure. Kim and Eric Norlin&#8217;s excellent Defrag and Glue events have gained big-name company, with O&#8217;Reilly&#8217;s Strata and GigaOM&#8217;s Structure both set to arrive in the first quarter of 2011. Venture firms like IA Ventures have emerged, specifically [...]]]></description>
			<content:encoded><![CDATA[<div class="zemanta-img" style="margin: 1em; display: block;">
<div>
<dl class="wp-caption alignright" style="width: 310px;">
<dt class="wp-caption-dt"><a href="http://commons.wikipedia.org/wiki/File:WorldWideWebAroundWikipedia.png"><img title="A data visualization of Wikipedia as part of t..." src="http://upload.wikimedia.org/wikipedia/commons/thumb/b/b9/WorldWideWebAroundWikipedia.png/300px-WorldWideWebAroundWikipedia.png" alt="A data visualization of Wikipedia as part of t..." width="300" height="216" /></a></dt>
<dd class="wp-caption-dd zemanta-img-attribution" style="font-size: 0.8em;">Image via <a href="http://commons.wikipedia.org/wiki/File:WorldWideWebAroundWikipedia.png">Wikipedia</a></dd>
</dl>
</div>
</div>
<p>&#8216;<a class="zem_slink" title="Big data" rel="wikipedia" href="http://en.wikipedia.org/wiki/Big_data">Big Data</a>&#8217; is currently capturing the imagination, attracting hype, investment and ambitious startups in almost equal measure. Kim and Eric Norlin&#8217;s excellent <a href="http://www.defragcon.com/">Defrag</a> and <a href="http://www.gluecon.com/">Glue</a> events have gained big-name company, with <a href="http://conferences.oreillynet.com/">O&#8217;Reilly</a>&#8217;s <a href="http://strataconf.com/strata2011">Strata</a> and <a href="http://gigaom.com/events/">GigaOM</a>&#8217;s <a href="http://gigaom.com/bigdata/">Structure</a> both set to arrive in the first quarter of 2011. Venture firms like <a href="http://www.iaventurepartners.com/">IA Ventures</a> have emerged, specifically targeted at finding, funding, and profiting from the <em>big</em> Big Data idea. Giants of the web from <a class="zem_slink" title="Yahoo!" rel="homepage" href="http://www.yahoo.com">Yahoo!</a> and <a href="http://www.amazon.com/">Amazon</a> to <a class="zem_slink" title="Twitter" rel="homepage" href="http://twitter.com">Twitter</a> and <a class="zem_slink" title="Facebook" rel="homepage" href="http://facebook.com">Facebook</a> solve their own Big Data problems in very different ways, contributing valuable code and experience to the community whilst simultaneously diluting focus and adding to the cacophony.</p>
<p>Flippantly reckoned by many to be &#8216;anything that requires more than a single machine to run,&#8217; the Big Data reality remains somewhat harder to pin down. To those seeking routine business insight, that mammoth Excel spreadsheet they laboriously query overnight at the end of each month might quite justifiably be thought of as &#8216;Big.&#8217; At the other end of the scale, data wizards scorn anything that doesn&#8217;t require a room full of servers, a mountain of empty pizza boxes, and the careful construction of a bespoke data ingest, management and querying system atop the most bare-bones version of the Linux kernel they can find. Somewhere between the two, a growing mass of cheaply gathered data holds out the promise of invaluable insight. Remote sensors, web clickstreams, social graph interactions, purchaser (and non-purchaser) behaviours. All these, and more, have much to tell planners, builders, makers, sellers, and buyers. If only we could formulate the right questions. If only we could devise the right sampling strategies. If only we had big enough machines to ask lots of questions using lots of sampling strategies. If only we had big enough machines to not bother sampling at all.</p>
<p>On the hardware side of things, even humble domestic laptops typically ship with at least two cores these days; two separate little computers ready to do the data processor&#8217;s bidding. Four, eight, sixteen and more cores are not far behind, but mainstream software products typically fail to exploit anything more than a single core. Push Excel as hard as you like, and it won&#8217;t do more than take <em>one</em> of your computer&#8217;s multiple cores to the max. On that 12-core Mac Pro you persuaded the boss to buy, only one core will be hard at work on your data. Twitter, Mail, YouTube, and ripping DVDs will each be giving other cores a little light exercise whilst others sit idly by, waiting for the arrival of operating systems and applications capable of exploiting multi-core power. The same is true as jobs grow and move to run across multiple machines, whether under your desk, in your data centre, or out in the Cloud. Those big datasets need to be carved up and shared amongst the available computers before any analysis takes place. You&#8217;re typically not accessing a &#8216;big computer in the Cloud&#8217; at all&#8230; but lots of relatively small (commodity) computers, and it takes careful planning and smart software to manage the division and recombination of those jobs in a cost-effective manner. Projects such as <a href="http://db.cs.berkeley.edu/jmh/">Joseph Hellerstein</a>&#8217;s Berkeley Orders of Magnitude (<a href="http://boom.cs.berkeley.edu/">BOOM</a>) begin to demonstrate some of the potential for working natively with multiple processors, but there&#8217;s a long way to go before those advances reach the mainstream.</p>
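<p>The division and recombination described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how any particular product works: the function names (<code>partition</code>, <code>count_words</code>, <code>merge_counts</code>, <code>word_count</code>) and the sample records are hypothetical, and the partitions are processed one after another here rather than on separate cores or machines.</p>

```python
from collections import Counter
from functools import reduce


def partition(records, n_parts):
    """Carve a list of records into n_parts roughly equal slices."""
    return [records[i::n_parts] for i in range(n_parts)]


def count_words(records):
    """Per-partition step: count words in one slice, independently of the others."""
    counts = Counter()
    for line in records:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Recombination step: fold two partial results into one."""
    return left + right


def word_count(records, n_parts=4):
    parts = partition(records, n_parts)
    # Each call touches only its own partition, so the built-in map used here
    # could in principle be swapped for multiprocessing.Pool.map to spread the
    # work across several cores (an assumption, not exercised in this sketch).
    partials = map(count_words, parts)
    return reduce(merge_counts, partials, Counter())


# Hypothetical sample data, purely for illustration.
sample = [
    "big data on the web",
    "the web of data",
    "linked data on the web",
]
totals = word_count(sample, n_parts=2)
```

<p>Because no partition depends on any other, the per-partition step is the part that could be farmed out to other cores or machines; the merge step is where the cost of recombination shows up.</p>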
<p><a href="http://en.wikipedia.org/wiki/Hadoop">Hadoop</a>, <a href="http://en.wikipedia.org/wiki/Apache_Cassandra">Cassandra</a>, <a class="zem_slink" title="MapReduce" rel="wikipedia" href="http://en.wikipedia.org/wiki/MapReduce">MapReduce</a>, <a href="http://en.wikipedia.org/wiki/Dynamo_(storage_system)">Dynamo</a>, <a href="http://en.wikipedia.org/wiki/Project_Voldemort#SNA_LinkedIn">Voldemort</a>. These, and more, are solutions developed by the likes of Yahoo!, Facebook, Google, Amazon and <a class="zem_slink" title="LinkedIn" rel="homepage" href="http://www.linkedin.com">LinkedIn</a> to tackle the influx of data that each faced &#8211; and for which each had failed to find an existing solution. Hadoop, with the addition of <a href="http://www.cloudera.com/">Cloudera</a>&#8217;s commercial polish, is rapidly emerging as the front runner for an off-the-shelf Big Data solution, but all of these tools remain rather narrow in their abilities. Find the type of data or the nature of query for which each of these was built and its performance will be unbeatable, but we are a very long way from Big Data&#8217;s equivalent of the jack-of-all-trades SQL-powered relational database of old.</p>
<p>And there, for many enterprises, lies the problem. Useful Google searches require the crawler, index and UI to do a relatively small number of essentially similar tasks, very quickly, very cost-effectively, and at massive scale. Focus on that finite set of problems, and you build a solution that delivers the experience we&#8217;ve all come to know. But each type of data manipulation or analysis requires a different tool, differently optimised, with the inevitable result that a typically diverse organisation may require a plethora of Big Data tools to get its work done. Or it might just continue to muddle along with Oracle or <a class="zem_slink" title="MySQL" rel="homepage" href="http://www.mysql.com">MySQL</a>, churning inefficiently through its data analysis jobs for interminably long periods of time. These relational database tools are understood, they are mature, and they get the job done. Except in the most data-intensive industries, they have a market presence that will be difficult to disrupt.</p>
<p>The Big Data space is seeing remarkable innovation, but there is a long way to go in order to lift it out of the domain of the technically proficient specialist and place it on desktops across the organisation. As IA Ventures&#8217; Brad Gillespie notes, &#8220;Excel is where the world&#8217;s data lives&#8230; [and] Big Data has to get to that place&#8230; so that a CMO can leverage it directly.&#8221;</p>
<p>And in all of this ferment of innovation, to return to the title of the post, it strikes me that Big Data is becoming disconnected from the fabric of the web itself. Oh, much of the data certainly <em>comes</em> from the Web, and a lot of it might even be queried on the Web after processing. But, somewhere along the line, the <em>linkedness</em> of the Web has either been forgotten or ignored. That rich set of connections, interconnections and associations has been reduced to a table, an index, or a (large) set of key-value pairs. And in the process, something fundamental has gone away.</p>
<p>This is enough for now, though. Looking more closely at different Big Data approaches, and exploring the potential for re-introducing the Web, must wait for future posts.</p>
<h6 class="zemanta-related-title" style="font-size: 1em;">Related articles</h6>
<ul class="zemanta-article-ul">
<li class="zemanta-article-ul-li"><a href="http://venturebeat.com/2010/10/26/cloudera-raises-25m-to-help-deal-with-the-enterprise-data-deluge/">Cloudera raises $25M to help deal with the enterprise data deluge</a> (venturebeat.com)</li>
<li class="zemanta-article-ul-li"><a href="http://radar.oreilly.com/2010/10/strata-week-building-data-star.html">Strata Week: Building data startups</a> (radar.oreilly.com)</li>
<li class="zemanta-article-ul-li"><a href="http://www.readwriteweb.com/cloud/2010/09/hadoop-and-a-critique-on-geek.php">Big Data and a Critique of Geek Culture</a> (readwriteweb.com)</li>
<li class="zemanta-article-ul-li"><a href="http://www.nytimes.com/external/gigaom/2010/10/30/30gigaom-big-data-and-nosql-march-to-the-enterprise-73963.html">Big Data and NoSQL March to the Enterprise</a> (nytimes.com)</li>
<li class="zemanta-article-ul-li"><a href="http://news.cnet.com/8301-21546_3-20023969-10253464.html?part=rss&amp;subj=news">Does &#8216;big data&#8217; equal big opportunity for storage vendors?</a> (news.cnet.com)</li>
<li class="zemanta-article-ul-li"><a href="http://blog.programmableweb.com/2010/11/29/new-york-times-event-shows-the-promise-of-big-data/">New York Times Event Shows the Promise of Big Data</a> (programmableweb.com)</li>
<li class="zemanta-article-ul-li"><a href="http://www.readwriteweb.com/enterprise/2010/11/executives-are-addicted-to-big.php">Overwhelmed Executives Still Crave Big Data, Says Survey</a> (readwriteweb.com)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://cloudofdata.com/2010/11/is-there-a-disconnect-between-big-data-and-the-web-of-data/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Opening up and letting go to strengthen market position</title>
		<link>http://cloudofdata.com/2009/05/opening-up-and-letting-go-to-strengthen-market-position/</link>
		<comments>http://cloudofdata.com/2009/05/opening-up-and-letting-go-to-strengthen-market-position/#comments</comments>
		<pubDate>Tue, 19 May 2009 12:57:14 +0000</pubDate>
		<dc:creator>Paul Miller</dc:creator>
				<category><![CDATA[Cloud computing]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Web 2.0]]></category>
		<category><![CDATA[Web 3.0]]></category>
		<category><![CDATA[AdaptiveBlue]]></category>
		<category><![CDATA[Alex Iskold]]></category>
		<category><![CDATA[Bert Armijo]]></category>
		<category><![CDATA[Glue]]></category>
		<category><![CDATA[Semantic Web Gang]]></category>

		<guid isPermaLink="false">http://cloudofdata.com/?p=636</guid>
		<description><![CDATA[Two separate pieces of news came my way during the night, and although both were written about elsewhere whilst those of us on this side of the Atlantic slept, they remain worthy of mention; both in their own right and because of the wider trend of which they are part. First, Cloud Computing provider 3Tera [...]]]></description>
			<content:encoded><![CDATA[<div class="zemanta-img" style="margin: 1em; display: block;">
<div class="wp-caption alignright" style="width: 220px"><a href="http://commons.wikipedia.org/wiki/Image:Bolton-newton.jpg"><img class=" " style="margin: 6px;" title="Isaac Newton (Bolton, Sarah K. Famous Men of S..." src="http://upload.wikimedia.org/wikipedia/commons/thumb/4/43/Bolton-newton.jpg/300px-Bolton-newton.jpg" alt="Isaac Newton (Bolton, Sarah K. Famous Men of S..." width="210" height="281" /></a><p class="wp-caption-text">Image via Wikipedia</p></div>
</div>
<p>Two separate pieces of news came my way during the night, and although both were written about elsewhere whilst those of us on this side of the Atlantic slept, they remain worthy of mention; both in their own right and because of the wider trend of which they are part.</p>
<p>First, Cloud Computing provider <a href="http://www.3tera.com/">3Tera</a> <a href="http://blog.3tera.com/computing/3tera-announces-appstore-for-cloud-computing-appliances/">announced</a> their <a href="http://www.3tera.com/AppStore/" class="broken_link">AppStore</a>:</p>
<blockquote><p>&#8220;the first marketplace for cloud components where enterprise users, software vendors and datacenter experts can exchange production-ready, scalable and highly available cloud components on a pay-per-use basis.&#8221;</p></blockquote>
<p>You can listen to my recent podcast with 3Tera&#8217;s Bert Armijo, <a href="http://cloudofdata.com/2009/04/a-podcast-conversation-with-3tera-co-founder-bert-armijo/">here</a>.</p>
<p>Second, Semantic Technology pioneer <a href="http://www.adaptiveblue.com/">AdaptiveBlue</a>&#8217;s CEO (and <a href="http://semanticgang.talis.com/" class="broken_link">Semantic Web Gang</a> regular) <a class="zem_slink" title="Alex Iskold" rel="homepage" href="http://www.readwriteweb.com/about_alex.php">Alex Iskold</a> sent an email to point me at <a href="http://blog.adaptiveblue.com/?p=2315">their launch</a> of a new <a href="http://www.getglue.com/api">Glue API</a>:</p>
<blockquote><p>&#8220;This new API taps into Glue’s databases and semantic recognition engine enabling fun &amp; useful applications about people and things.&#8221;</p></blockquote>
<p>Both companies recognise that there is a far greater pool of talent <em>outside</em> their employ than <em>inside</em>, and both are seeking to place themselves and their technology at the centre of a scalable ecosystem rather than perpetuating the old-fashioned model of supplying solutions. Both recognise, too, that the &#8216;sticky&#8217; destination site is becoming increasingly irrelevant to many of our online behaviours. We want and need functionality, community, reliability and more; but we want it on <em>our</em> terms, delivered in real time at the point of need.</p>
<p>By seeking to build and mediate a critical mass of third party applications, 3Tera is repeating a formula successfully demonstrated by the likes of Apple, Salesforce and others. Partners and developers deliver far more working code than 3Tera&#8217;s own developers could manage, and those partners are incentivised to bring their own customer base along with them. 3Tera benefits every time one of these partners makes a sale, for negligible effort on 3Tera&#8217;s part, and has an easy route to new customers of its own via its partners. In time, 3Tera&#8217;s own AppLogic may even come to be perceived increasingly as no more than an on-ramp to the AppStore&#8217;s riches.</p>
<p>AdaptiveBlue&#8217;s API extends Glue in an obvious direction, adding to site-independent strengths upon which <a href="http://blogs.zdnet.com/semantic-web/?p=266">I have remarked previously</a>.</p>
<p>The world is changing. By embracing the power of networks (both technological and social) and putting others to work on their behalf, companies are increasingly able to punch far above their weight. It is ever more feasible for small, agile, responsive and engaged organisations to draw upon the resources of others to mutual benefit. As Newton once wrote,</p>
<blockquote><p>&#8220;If I have seen further it is by standing on the shoulders of giants.&#8221;</p></blockquote>
<h6 class="zemanta-related-title" style="font-size: 1em;">Related articles by Zemanta</h6>
<ul class="zemanta-article-ul">
<li class="zemanta-article-ul-li"><a href="http://www.readwriteweb.com/archives/3tera_app_store.php">3Tera to Support AppLogic with New AppStore, Now Seeking Cloudware Vendors</a> (readwriteweb.com)</li>
<li class="zemanta-article-ul-li"><a href="http://www.readwriteweb.com/archives/glue_gets_stickier_with_conversations_and_recommen.php">Glue Gets Stickier With Conversations and Recommendations</a> (readwriteweb.com)</li>
<li class="zemanta-article-ul-li"><a href="http://gigaom.com/2009/05/15/how-the-cloud-will-disrupt-the-it-status-quo/">How the Cloud Will Disrupt the IT Status Quo</a> (gigaom.com)</li>
<li class="zemanta-article-ul-li"><a href="http://factoryjoe.com/blog/2009/05/18/the-open-social-web/">The open, social web</a> (factoryjoe.com)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://cloudofdata.com/2009/05/opening-up-and-letting-go-to-strengthen-market-position/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>

