Teradata Aster Standardizes Access to Hadoop with SQL-H

Using Hadoop just got easier, thanks to Teradata’s introduction of SQL-H, a new query interface for analyzing data stored in Hadoop. Most Hadoop access methods require preprocessing and staging of data from the Hadoop Distributed File System (HDFS) using technologies such as MapReduce. These approaches demand new skills and technologies, adding time and cost for users and offsetting the benefits of Hadoop, which, according to our big data benchmark research, include increasing the speed of analysis. Teradata has announced support for SQL-H not only for its own Aster Database 5.0, which it expects to release in the third quarter, but also for the commercial distribution of Hadoop from Hortonworks.

Use of a familiar query interface, by contrast, reduces the staffing and training needed to learn Hadoop-specific interfaces; our research found staffing and training to be the top two obstacles to big data analytics. Teradata Aster accomplishes this by using HCatalog to access metadata, so the data can be queried with Aster SQL, ODBC, JDBC and ultimately any analytics or business intelligence tool, since the data then looks like a database table structure. The need to extract data from Hadoop and store it in other database systems, thereby losing the computing power of Hadoop, has been the Achilles’ heel of this big data technology. Analysts who want interactive and iterative discovery of their data no longer have to depend on the Hadoop Hive query language interface and can use more familiar tools like MicroStrategy and Tableau for analytics. Teradata Aster also brings its own analytics to data in Hadoop, including the customer and transaction data that, according to our research, top the list of types used. Its capabilities include analytics for paths, text, statistics, segmentation and broader customer interaction.
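
To give a sense of what this looks like in practice, here is a minimal sketch of querying an HCatalog-registered Hadoop table through the ODBC interface from Python. The DSN, credentials, table and column names are hypothetical examples, not Teradata Aster specifics; the exact driver and DSN configuration would come from Teradata Aster’s documentation.

```python
# Minimal sketch: querying Hadoop data exposed through SQL-H over ODBC.
# The DSN, credentials, table and column names below are hypothetical.
import pyodbc

# Connect to an ODBC data source configured for the Aster database.
conn = pyodbc.connect("DSN=aster_sqlh;UID=analyst;PWD=secret")
cursor = conn.cursor()

# Because SQL-H surfaces HCatalog metadata as ordinary tables,
# a Hadoop-resident table can be queried with plain SQL.
cursor.execute("""
    SELECT customer_id, SUM(amount) AS total_spend
    FROM hcatalog_transactions  -- hypothetical HCatalog-backed table
    GROUP BY customer_id
    ORDER BY total_spend DESC
    LIMIT 10
""")

for customer_id, total_spend in cursor.fetchall():
    print(customer_id, total_spend)

cursor.close()
conn.close()
```

The same table, exposed through ODBC or JDBC, would be visible to business intelligence tools such as MicroStrategy or Tableau without any intermediate extraction step.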

Teradata Aster has an advantage over EMC Greenplum, IBM and Oracle, which do not provide this level of direct integration with Hadoop today. Their approach requires data duplication and does not take advantage of Hadoop’s processing power or of HCatalog’s metadata about the data itself. I expect that if other vendors want to exploit the power of Hadoop, they will need to expand their support for it over the coming year.

The introduction of SQL-H in Teradata Aster helps analysts streamline their analytics while reducing the custom coding and development required from IT staffers. The Aster platform also provides computational advantages through its scale-out approach across a range of server technologies. According to our research, one-third of organizations plan to use Hadoop. For Teradata Aster, support for Hadoop builds on its existing big data support. Organizations looking to further exploit Hadoop to analyze large volumes of data quickly should find Teradata Aster SQL-H a welcome advancement for their data and analytic options.

Regards,

Mark Smith – CEO & Chief Research Officer

10 thoughts on “Teradata Aster Standardizes Access to Hadoop with SQL-H”

  1. Mark,

    I thought I should correct your comment about EMC Greenplum requiring duplication of data in order to access it with SQL. Greenplum supports direct access to Hadoop data via external tables without requiring any duplication of data (see the sketch after these comments). This functionality has been available for over a year, beginning with Version 4.1 of the Greenplum Database.

    Dave
    EMC Greenplum

  2. Thanks a lot for your excellent summary of the benefits of SQL-H. However, I would like to add that, while you mention customer and transaction data in particular, I see even more benefits in analyzing multi-structured data. After all, companies like Amazon, eBay, Facebook and Twitter – to name but a few – use HDFS to store huge amounts of data, mainly texts, pictures and other interactions. Enabling access to HDFS using standard SQL will substantially increase the number of users who can take advantage of such multi-structured data, as one of my colleagues pointed out at http://blogs.teradata.com/emea/Democratizing-big-data/. I am quite sure that, with a growing number of users, new possibilities for using data stored in HDFS will soon open up, and I look forward to continuing the discussion on this topic.
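
Regarding the external-table approach mentioned in the first comment, here is a minimal sketch, assuming a Greenplum 4.1-era database reachable with psycopg2, of defining an external table over a file in HDFS via the gphdfs protocol and querying it in place. The host, HDFS path, table and column names are hypothetical, and the exact syntax should be checked against EMC Greenplum’s documentation.

```python
# Minimal sketch: reading HDFS data in place from Greenplum via an external table.
# Connection details, HDFS location and column layout are hypothetical examples.
import psycopg2

conn = psycopg2.connect(host="gp-master", dbname="analytics",
                        user="analyst", password="secret")
conn.autocommit = True
cur = conn.cursor()

# Define a readable external table over a delimited file in HDFS using the
# gphdfs protocol, so no data is copied into the Greenplum database.
cur.execute("""
    CREATE EXTERNAL TABLE ext_transactions (
        customer_id bigint,
        amount      numeric
    )
    LOCATION ('gphdfs://namenode:8020/data/transactions/*.txt')
    FORMAT 'TEXT' (DELIMITER '|')
""")

# Query the Hadoop-resident data directly with SQL.
cur.execute("""
    SELECT customer_id, SUM(amount) AS total_spend
    FROM ext_transactions
    GROUP BY customer_id
    ORDER BY total_spend DESC
    LIMIT 10
""")
print(cur.fetchall())

cur.close()
conn.close()
```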
