In his keynote at Spark Summit 2014 in San Francisco today, Databricks CEO Ion Stoica unveiled Databricks Cloud, a cloud platform built around the Apache Spark open source processing engine for big data.
Spark, which reached its version 1.0 release just one month ago, is a cluster computing framework designed to run on top of the Hadoop Distributed File System (HDFS) in place of Hadoop MapReduce. With support for in-memory cluster computing, Spark can run up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.
Spark can be an excellent compute engine for data processing workflows, advanced analytics, stream processing and business intelligence/visual analytics. But Spark clusters can be difficult beasts, Stoica says. Databricks hopes to change that by offering its hosted Databricks Cloud platform as a turnkey solution.