16 Top Big Data Analytics Platforms

Data analysis is a do-or-die requirement for today’s businesses. We analyze notable vendor choices, from Hadoop upstarts to conventional database players.

Revolutionary. That pretty much describes the data analysis era in which we live. Businesses grapple with huge quantities and varieties of data on one hand, and ever-faster expectations for analysis on the other. The vendor community is responding by providing highly distributed architectures and new levels of memory and processing power. Upstarts also exploit the open-source licensing model, which isn’t new, but is increasingly accepted and even sought out by data-management professionals.

Apache Hadoop, a nine-year-old open-source data-processing platform first used by Internet giants including Yahoo and Facebook, leads the big-data revolution. Cloudera introduced commercial support for enterprises in 2008, and MapR and Hortonworks piled on in 2009 and 2011, respectively. Among data-management incumbents, IBM and EMC spinout Pivotal have each introduced their own Hadoop distribution. Microsoft and Teradata offer complementary software and first-line support for Hortonworks’ platform. Oracle resells and supports Cloudera, while HP, SAP, and others act more like Switzerland, working with multiple Hadoop software providers.

In-memory analysis is gaining steam as Moore’s Law brings us faster, cheaper, and more memory-rich processors. SAP has been the biggest champion of the in-memory approach with its Hana platform, but Microsoft and Oracle are now poised to introduce in-memory options for their flagship databases. Focused analytical database vendors including Actian, HP Vertica, Kognitio, and Teradata have introduced options for high RAM-to-disk ratios, along with tools to place specific data into memory for ultra-fast analysis.

Advances in bandwidth, memory, and processing power have also improved real-time stream-processing and stream-analysis capabilities, but this technology has yet to see broad adoption. Several vendors here offer complex event processing, but outside of the financial trading, national intelligence, and security communities, deployments have been rare. Watch this space and, particularly, new open-source options as breakthrough applications in ad delivery, content personalization, logistics, and other areas push broader adoption.

Our slideshow includes broad-based data-management vendors (IBM, Microsoft, Oracle, and SAP) that provide everything from data-integration software and database-management systems (DBMSs) to business intelligence and analytics software, to in-memory, stream-processing, and Hadoop options. Teradata is a blue chip focused more narrowly on data management, and like Pivotal, it has close ties with analytics market leader SAS.

Plenty of vendors covered here offer cloud options, but 1010data and Amazon Web Services (AWS) have staked their entire businesses on the cloud model. Amazon has the broader selection of products of the two, and it’s an obvious choice for those running big workloads and storing lots of data on the AWS platform. 1010data has a highly scalable database service and supporting information-management, BI, and analytics capabilities that are served up private-cloud style.

The jury is still out on whether Hadoop will become as indispensable as database management systems. Where volume and variety are extreme, Hadoop has proven its utility and cost advantages. Cloudera, Hortonworks, and MapR are doing everything they can to move Hadoop beyond high-scale storage and MapReduce processing into the world of analytics.

The niche vendors here include Actian, InfiniDB/Calpont, HP Vertica, Infobright, and Kognitio, all of which have centered their big-data stories around database management systems focused entirely on analytics rather than transaction processing. German DBMS vendor Exasol is another niche player in this mold, but we do not cover it here as its customer base is almost entirely in continental Europe. It opened offices in the U.S. and U.K. in January 2014.

This collection doesn’t cover analytics vendors, such as Alpine Data Labs, Revolution Analytics, and SAS. These vendors invariably work in conjunction with platforms provided by third-party DBMS vendors and Hadoop distributors, although SAS in particular is blurring this line with growing support for SAS-managed in-memory data grids and Hadoop environments. We also excluded NoSQL and NewSQL DBMSs, which are heavily (though not entirely) focused on high-scale transaction processing, not analytics. We plan to cover NoSQL and NewSQL platforms in a separate, soon-to-be-published collection.

Now dig in and learn more about these analytics vendors and how they compare.