Big data is creating a massive disruption for the IT industry. Faced with exponentially growing data volumes in every area of business and the web, companies around the world are looking beyond their current databases and data warehouses for new ways to handle this data deluge. Taking a lead from Google, a number of organizations have been exploring the potential of MapReduce, and its open-source clone Hadoop, for big data processing. The MapReduce/Hadoop approach is based on the idea that what's needed is not database processing with SQL queries, but rather dataflow computing with simple parallel programming primitives such as map and reduce. As Google and others have shown, this kind of basic dataflow programming model can be implemented as a coarse-grained set of parallel tasks that can be run across hundreds or thousands of machines to carry out large-scale bat... (more)
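
As a rough illustration of the map and reduce primitives mentioned in the excerpt above, here is a minimal single-process sketch in Python. It is not Hadoop's or Google's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example, and word counting is assumed as the sample task. In a real deployment the map and reduce calls would be spread across many machines.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group the emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values seen for one key into a single total."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data big clusters", "data flows"]
    print(word_count(docs))
```

Because each `map_phase` call depends only on its own record, and each `reduce_phase` call only on its own key, both steps could in principle run as independent parallel tasks, which is the property the excerpt describes.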

Cloud Analytics: Dataflow vs Databases

For twenty years, analytics has been viewed as just one specific area within the broader relational database industry. So, analytics has meant databases. Today that view is changing. Over the past year or so, a new movement, the "NoSQL" movement, has emerged, promoting the advantages of doing a variety of kinds of analytics without using any relational database technologies at all. Whatever one thinks of the capabilities and limitations of distributed key-value stores relative to relational databases, one thing is clear: the stranglehold that SQL has held over all aspects of data an... (more)

What's Really Industry Changing About Cloud Computing?

Bill McColl's "Cloud N" Blog

This is an incredibly important time for the cloud computing area. But let’s try to move the discussion of it in the press along from an obsession with new datacenter buildings located by power stations, with the total server numbers at Microsoft and Google, and with Amazon’s hourly pricing for EC2. Interesting though those aspects of cloud computing appear to be to journalists, they hardly represent what is really industry changing about cloud computing. What are some of the new directions in the massively parallel cloud computing space? I’ll mentio... (more)

Welcome To The Realtime Intercloud

In the past it was so much easier. Search engines could crawl the web at a leisurely pace, clean up the data, build indexes, and every so often provide a new, improved search experience with more web pages covered. That was the internet, the old non-realtime internet. Today it’s different! Not only are people interested in a lot more than just web pages, they want to see everything LIVE, IN REALTIME. NO DELAYS. NO LATENCY. The solutions that were fine for the old web just don’t cut it for the realtime web. It’s the same with enterprise applications and services. In the past all ... (more)

25 Years of Big Data: From SQL To The Cloud

Cloudcel on Ulitzer

Back in 1985, the world was pre-web, data volumes were small, and no one was grappling with information overload. Relational databases and the shiny new SQL query language were just about perfect for this era. At work, 100% of the data required by employees was internal business data, the data was highly structured, and was organized in simple tables. Users would pull data from the database when they realized they needed it. Fast forward to 2010. Today, everyone is grappling constantly with information overload, both in their work and in their social life. Most ... (more)