Informatica 9.5 Supports New Generation of Big Data and Cloud Computing

Informatica has announced a major release, version 9.5, of its software platform, which will be generally available in June. The company’s data integration technologies will support the new generation of computing that includes big data, cloud computing, mobile and social media. These computing environments, which our firm has defined as key business technology drivers for this decade, have a compelling impact on the data that enterprises create and use. Being smart about integrating and utilizing significant volumes of data is essential; continuously copying and storing duplicate versions of data is not the best path forward.

I recently assessed Informatica’s business and technology strategy. It supports a range of big-data environments, including Hadoop, in-memory processing and appliances, which our benchmark research found to be top priorities of companies using the technology. Informatica helps IT organizations get to these larger volumes of data through an expanded set of data replication capabilities that can clone data for use in specialized Hadoop environments. In July, Informatica will provide an early release of a common visual data development environment that uses Hadoop technologies including HParser, Pig, Hive, MapReduce and HDFS, not just for integrating the data but also for tasks such as cleansing, parsing, transformation and identity resolution. Informatica is also partnering with Cloudera, Hortonworks and MapR to integrate with their distributions of Hadoop. As data on a large scale becomes more common in processing tasks from production to near-line access and archiving, Informatica has improved its partitioning to ensure that only what is needed is used and stored. All of these steps help ensure the proper integration of big-data technologies like Hadoop. Our benchmark research found that 80 percent of organizations using Hadoop have confidence in its support for big data, compared to 55 percent of those using all other approaches.

To support smarter integration of data, Informatica has improved data discovery, which can now determine the physical and logical structures needed to support a variety of integration techniques. Informatica has also completed the integration of the data validation technology it acquired in 2010; with this option, data can be processed as it moves across development and deployment environments. To ensure consistency and common identification of information, the product supports new dynamics in master data management (MDM), which include a data timeline to help users understand changes to a customer’s information over time.

Integrating data from other applications across cloud-based applications is an obstacle in 82 percent of organizations, according to our benchmark research on business data in the cloud. Informatica has expanded integrated access to cloud computing environments. It can deploy embeddable cloud services that users can select from a catalogue, configure, integrate into scheduled jobs and monitor. It recently announced the release of a cloud connectivity framework to help customers and partners develop interfaces to applications and services in cloud environments. This step was part of the Informatica Cloud Spring 2012 release, which positions the company’s software as an integration platform and as an integration-as-a-service technology. The company says it is now averaging more than 1 billion transactions per day.

Informatica has increased its ability to mask data, model data risk, classify data through discovery and apply policies and monitor usage to ensure that proper levels of governance are applied. The federated data services in version 9.5 let analysts define and connect logical data objects to disparate sources of data. Supporting the real-time needs of an organization is part of what we call operational intelligence.

Informatica’s support for the variety in types of big data demonstrates its commitment to IT environments that deal with transactional and event-oriented data, which are becoming just as important as traditional business data, and for the data created in business transactions. Informatica also helps analysts who engage in data-related tasks, from access to cleansing, by providing new tools, including the Data Integration Analyst Option. This is critical, as our business analytics benchmark research found that analysts spend two-thirds of their analytic time on data-related tasks, and any help for them is very welcome.

Informatica and Kapow Software have partnered to extend Informatica’s integration to the Web and semi-structured content. I recently assessed Kapow, and this integration is a critical component for getting information from an organization’s cloud computing and web application environments.

Informatica’s smart use of data integration technology can help IT organizations reduce their operating and technology costs. Informatica’s expansion of data integration technologies through acquisitions to include governance, quality, virtualization and master data management was a smart move. Our benchmark research on trends in information management found many organizations expecting to initiate or plan projects for an expanding set of these technologies in the coming 18 months. Informatica’s products address a broad span of data needs inside and outside of the enterprise. Release 9.5 supports a range of big-data needs and helps ensure that Informatica can compete in the new wave of investments focused on increasing the usefulness of data from cloud computing to big data.


Mark Smith – CEO & Chief Research Officer

Pentaho Business Analytics Brings Visual Discovery and More Big Data Support

With the release of Business Analytics version 4.5, Pentaho has expanded its platform and tools to address the needs of business and IT. The product has come a long way since the version 4 release less than a year ago, which broke ground in ease of use and support for big-data sources. Advancing beyond its roots in business intelligence, Pentaho Business Analytics 4.5 addresses data discovery, data integration and data mining and provides visual discovery and analytics that operate against stores of big data.

New data discovery features in version 4.5 include a group of interactive visualizations for geo-mapping, heat grids, and scatter and bubble charts. These visualizations are starting points for navigating into what are usually large amounts of data. Significant improvements to data caching have made visual discovery very responsive even when it spans big data. You can easily navigate into the visualizations through simple selections or interactions on them. I like the way Pentaho has added geographic visualization and location intelligence to business analytics, along with the ability to add geographic layers from Google to help users better understand the context of location. Also, by adding visualizations within tables and reports of business facts and figures, Pentaho makes it easier to pinpoint over- and underperformers. Visual discovery is useful for analysts who are tired of tools, including Microsoft Excel, that do not provide enough interactive visualization and analytics.

Pentaho has expanded its support for non-SQL environments such as Apache Cassandra, DataStax and MongoDB through read and write interfaces for reporting and analytics. In version 4.5 Pentaho expands its existing support for Hadoop. The software can now be more easily deployed across Hadoop clusters, and supports secure Hadoop clusters. Pentaho recently announced support for MapR and now adds support for Apache, Cloudera and Hortonworks. The Pentaho MapReduce visual designer is easier to use with Pentaho Data Integration (Kettle). Users can visually access and integrate big-data sources and others through Pentaho Data Integration’s workflow and rules. This ease of use is essential, as our benchmark research on big data found that usability has the highest level of importance for evaluating vendors and their products in 69 percent of organizations.

Pentaho also recently released a new data quality product; the consistency and quality of data are even more critical as the volumes and velocity of data increase. Our recent benchmark research on information management trends found that organizations that utilize data quality software trust their business facts and figures almost 25 percent more than those that do not. Data quality and data integration are two of the key components of information management according to our benchmark, and having them work together for business analytics is critical to improving the data pipeline for analysts. I expect that Pentaho will offer more direct and even virtualized access to big data without having to integrate data from Hadoop and other sources into a relational database for analysis.

Pentaho also licenses its products to other software providers to embed in their own. As part of this effort, in version 4.5 it has added more flexibility for partners to add visualizations and data sources through its interface and scripting. Embedding business analytics as part of applications helps broaden use of the technology.

You can freely download the open source and trial versions of Pentaho’s products, and the company says it gets a download every 30 seconds. I would like Pentaho to advance further its collaboration and search capabilities to make the analytics more business-driven. I also wish tablet users could access the mobile capabilities in the latest version through a single link from its website.

Pentaho Business Analytics 4.5 brings together support for discovery and navigation of data. Together with Pentaho Data Integration and Data Quality, it addresses the top obstacle we found in our business analytics benchmark research: two-thirds of the analytics process is spent on data-related tasks. Its expanding support for Hadoop is critical, as our benchmark on Hadoop and information management found that Hadoop projects require significantly more data integration and visualization than non-Hadoop environments. This new release helps business and IT work together. Users can massage data and perform analysis with an integrated set of products from a single vendor, which our research finds fewer than one in five organizations do today. If you have not taken a look at Pentaho, investigate this version, as it is a great example of business intelligence growing into business analytics.


Mark Smith – CEO & Chief Research Officer