I stopped into the Yahoo Hadoop Summit (Twitter: #HadoopSummit) to see how far the open source Hadoop technology has progressed. This open source community has been advancing for years with support from Internet titans like Yahoo, eBay and Facebook. Hadoop, as my colleague David Menninger has written, is now ready to play a large role as organizations try to cope with data on a large scale and solidify their information management agenda.

With over 1,600 attendees, the Yahoo Hadoop Summit was quite vibrant. A highlight was Yahoo’s announcement that, with help from Benchmark Capital, it has spun out its Hadoop efforts into a separate organization called Hortonworks, which David assessed. I have to say that the launch of Hortonworks left a lot to be desired. If I had to point out the weakest link of the new organization, it is its marketing and its ability to act like an elephant in the room and blow its own trunk. For example, the launch of Hortonworks at the conference was not very impactful, and demonstrations of what its software provides were available only in specific sessions. I got some indications that marketing might be improving, but clearly the elephant has left the circus and is heading down the freeway, and it now must be better presented to the market.

This is only the latest in a series of corporate moves around Hadoop as software providers position themselves to support the information technology ecosystem around big data. For example, check out the ones we have analyzed: EMC, Pentaho, Informatica and even IBM. Smaller providers like Cloudera are making Hadoop safer for enterprise use via commercially supported versions. At the Yahoo Hadoop Summit a range of providers demonstrated capabilities, including MapR for administration and Datameer, Karmasphere and Zettaset for analytics. For Hadoop to be part of the new information technology fabric, integration is needed, and Pervasive and Syncsort are addressing this opportunity. These and more than a dozen other providers showed Hadoop-related integration; Asterdata (now owned by Teradata) demonstrated MapReduce and SQL integration under a newly awarded patent for SQL-to-MapReduce techniques.

Hadoop is represented graphically by an elephant, and that reminds me that its evolving information technology framework is capable of challenging a true industry behemoth, Oracle. If Hadoop were not available today, organizations that use it might be considering Oracle technology to manage content and data in large, distributed clusters. Our nearly complete research on Hadoop and Information Management that we announced recently will be released in full soon, and it has some valuable insights on Hadoop use and organizations’ intentions that could impact Oracle’s position in this part of the IT industry. Hadoop may obviate the need to purchase Oracle for many types of deployments and thus could get in the way of Oracle advancing its RDBMS and middleware dominance. It is clear now that Oracle has made sure to minimize the open source aspects of MySQL after buying it along with Sun Microsystems and has quietly removed it from any significant role in the industry. Oracle won’t be able to do that with Hadoop, at least in the short term, since Hadoop has broad support in the open source movement.

Oracle hasn’t said much publicly about Hadoop, which threatens future licensing of its own technologies. A quick search for Hadoop on the Oracle website brings up only 13 related links, but digging into the Oracle Technology Network reveals a large number of supporting resources, which demonstrates the challenge Hadoop poses for the database titan. If Oracle has a strategy for combating Hadoop, it has not communicated it yet; now that I think about it, Oracle has become pretty closed off lately when it comes to engaging the industry.

Our new research finds a surging level of confidence among Hadoop users that it can meet their big-data needs in place of an RDBMS. Few open source software developments have had this strong a potential impact on the largest software companies and conventional databases, and none of them could impact Oracle’s future as severely as Hadoop. It will be interesting to see how this plays out in the offices of CIOs and information management professionals; the relevance and importance of these technologies and providers promise a new battle in the software war that will be worth more than peanuts.

David Menninger, our VP and Research Director for IT (Twitter: @dmenningervr), will soon unveil the first in-depth Hadoop and Information Management research, and I’m sure that it will make everyone rethink their strategy not just for storing big data but for how they analyze and use it. Please register for our webinar and come learn how Hadoop could be the right tool to help you manage masses of information, content and data in your enterprise and potentially lower the costs of doing so. For me it is definitely worth buying a bag of peanuts to watch the elephants face off in the market.

Regards,

Mark Smith – Chief Research Officer