In big data computing, and more generally in all commercial highly parallel
software systems, speed matters more than just about anything else. The
reason is straightforward, and has been known for decades.
Put very simply, when it comes to massively parallel software of the kind
needed to handle big data, fast is both better AND cheaper. Faster means lower
latency AND lower cost.
At first this may seem counterintuitive. A high-end sports car will be much
faster than a standard family sedan, but the family sedan may be much
cheaper. Cheaper to buy, and cheaper to run. But massively parallel software
running on commodity hardware is a quite different type of product from a
car. In general, the faster it goes, the cheaper it is to run.
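To make the "faster is cheaper" arithmetic concrete, here is a rough sketch. It assumes a fixed hourly workload, perfectly linear scaling across nodes, and a flat price per node-hour. All of the names and numbers are hypothetical, chosen only for illustration, and real systems rarely scale this cleanly.

```python
import math

def nodes_needed(workload_per_hour, per_node_throughput):
    """Nodes required to keep up with the workload, assuming linear scaling."""
    return math.ceil(workload_per_hour / per_node_throughput)

def hourly_cost(workload_per_hour, per_node_throughput, price_per_node_hour):
    """Cluster cost per hour for a given per-node speed."""
    return nodes_needed(workload_per_hour, per_node_throughput) * price_per_node_hour

# Hypothetical figures, not taken from any real deployment.
WORKLOAD = 1_000_000      # records to process per hour
PRICE = 0.50              # dollars per node-hour

fast_cost = hourly_cost(WORKLOAD, 100_000, PRICE)  # fast software
slow_cost = hourly_cost(WORKLOAD, 2_000, PRICE)    # software 50x slower per node

print(f"fast: {nodes_needed(WORKLOAD, 100_000)} nodes, ${fast_cost:.2f}/hour")
print(f"slow: {nodes_needed(WORKLOAD, 2_000)} nodes, ${slow_cost:.2f}/hour")
```

Under these assumptions the slower software needs 50 times as many nodes for the same workload, so its hourly bill is 50 times larger.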
Time Is Money
As has been noted many times in the history of computing, if you are 50x
slower, then you will need 50x more nod...
With Dropbox, Jive, Yammer, Chatter and a number of other new services, the
modern enterprise is rapidly becoming "consumerized". And it's not just
business: the same is happening in major web companies, on Wall Street, in
government agencies, and in science labs. Thirty years of bad "enterprise
software" experiences are making this transition happen much more quickly than
anyone would have expected. The shift to cloud computing is also accelerating
the trend, as is the goal of developing a much more "social" approach to
business.
The other major change that's going on today thr...
Over the next few years, the "realtime commerce" space is set to experience
phenomenal growth. Companies such as One Kings Lane, Groupon, LivingSocial,
Ideeli and others are developing a whole set of new retail business models.
Others such as Amazon and Netflix are developing ever more powerful
recommendation engines, and a huge number of companies are looking to develop
more accurate models for personalization and targeted advertising. A new
"realtime commerce" revolution is underway that will be as profound a shift
for the business world as the eCommerce revolution of the 1990s...
Two weeks ago I wrote about "The Need for Speed" in cloud computing, and
asked "Who is going to build the low-latency cloud for enterprise
customers?". Today Werner Vogels and his team at Amazon announced their
Cluster Compute Instances offering.
This is a very important step forward towards the kind of realtime, high
performance cloud that customers such as Cloudscale require to deliver the
next generation of cloud services. In our case, it means we now have three
distinct alternatives for deployment of our massively parallel realtime data
warehouse architecture: standard public...
Over the past few years, Hadoop has become something of a poster child for
the NoSQL movement. Whether it's interpreted as "No SQL" or "Not Only SQL",
the message has been clear: if you have big data challenges, then your
programming tool of choice should be Hadoop. Sure, continue to use SQL for
your ancient legacy stuff, but when you need cutting edge performance and
scalability, it's time to go Hadoop.
The only problem with this story is that the people who really do have
cutting edge performance and scalability requirements today have already
moved on from the Hadoop model. A ...