Two weeks ago I wrote about "The Need for Speed" in cloud computing, and
asked "Who is going to build the low-latency cloud for enterprise
customers?" Today Werner Vogels and his team at Amazon announced their
Cluster Compute Instances offering.
This is a very important step forward towards the kind of realtime, high
performance cloud that customers such as Cloudscale require to deliver the
next generation of cloud services. In our case, it means we now have three
distinct alternatives for deployment of our massively parallel realtime data
warehouse architecture: standard public cloud (EC2+S3+...), in-house cluster
(Eucalyptus or native), and now high-performance public cloud cluster.
As a former parallel supercomputing researcher turned realtime analytics
guy, I'm excited and impressed by what Werner and his team are opening up
today. Just as commodity clusters hav...
Big data is creating a massive disruption for the IT industry. Faced with
exponentially growing data volumes in every area of business and the web,
companies around the world are looking beyond their current databases and
data warehouses for new ways to handle this data deluge.
Taking a lead from Google, a number of organizations have been exploring the
potential of MapReduce, and its open source clone Hadoop, for big data
processing. The MapReduce/Hadoop approach is based around the idea that
what's needed is not database processing with SQL queries, but rather
dataflow computin...
Bill McColl's "Cloud N" Blog
This is an incredibly important time for cloud computing. But
let’s try to move the press discussion beyond an
obsession with new datacenter buildings located by power stations, with the
total server numbers at Microsoft and Google, and with Amazon’s hourly
pricing for EC2. Interesting though those aspects of cloud computing appear
to be to journalists, they hardly represent what is really industry-changing
about cloud computing.
What are some of the new directions in the massively parallel cloud computing
space? I’ll mentio...
Virtualization Track at Cloud Expo
SQL was the first-generation Big Data tool, and MapReduce/Hadoop was the
second-generation tool. Unfortunately, neither of these tools has the
characteristics required to break into the mainstream of data analytics,
where there are now over 100 million business professionals (non-programmers)
grappling with exponentially growing data volumes that they simply can't
handle.
However, a new third generation of tools for Big Data is now emerging that
offers the scalability, parallelism, performance and data flexibility of tools
like Hadoop, but, unlik...
The intercloud turns computing inside out. With traditional IT, we move the
data to where the computing infrastructure is located. With the data volumes
in most application areas growing exponentially, this IT model is now
broken. Moving massive volumes of data around means more bandwidth, more
storage, and more latency. We need instead to position the computing
infrastructure next to where the data is located. With intercloud computing,
we can build global apps and services where a single app can operate on data
that may be spread across many public clouds and private datace...