You are currently browsing the tag archive for the ‘Data Integration’ tag.
Big data has great promise for many organizations today, but they also need technology to facilitate integration of various data stores, as I recently pointed out. Our big data integration benchmark research makes it clear that organizations are aware of the need to integrate big data, but most have yet to address it: In this area our Performance Index analysis, which assesses competency and maturity of organizations, concludes that only 13 percent reach the highest of four levels, Innovative. Furthermore, while many organizations are sophisticated in dealing with the information, they are less able to handle the people-related areas, lacking the right level of training in the skills required to integrate big data. Most said that the training they provide is only somewhat adequate or inadequate.
Big data is still new to many organizations, and they face challenges in integrating big data that prevent them from gaining full value from their existing and potential investments. Our research finds that many lack confidence in processing large volumes of data. More than half (55%) of organizations characterized themselves as only somewhat confident or not confident in their ability to accomplish that task. They have even less confidence in their ability to process data that arrives at high velocity: Only 29 percent said they are somewhat confident or not confident in that. In dealing with the variety of big data, confidence is somewhat stronger, as more than half (56%) declared themselves confident or very confident. Assurance in one aspect is often found in others: 86 percent of organizations that said they are very confident in their ability to integrate the variety of big data are satisfied with how they manage the storage of big data. Similarly 91 percent of those that are confident or very confident with their data quality are satisfied with the way they manage the storage of big data.
Turning to the technology being used, we find only one-third (32%) of organizations satisfied with their current data integration technology, but twice as many (66%) are satisfied with their data integration processes for loading and creating big data. A substantial majority (86%) of those very confident in their ability to integrate the needed variety of big data are satisfied with their existing data integration processes. Those that are not satisfied said the process is too slow (61%), analytics are hard to build and maintain (50%) and data is not readily available (39%). These findings indicate that making a commitment to data integration, for big data and otherwise, can pay off in confidence and satisfaction with the processes for doing it. Additionally, organizations that use dedicated data integration technology (86%) are satisfied much more often than those that don’t use dedicated technology (52%).
New types of big data technologies are being introduced to meet expanding demand for storage and use of information across the enterprise. One of those fast-growing technologies is the open source Apache Hadoop and commercial enterprise versions of it that provide a distributed file system to manage large volumes of data. The research finds that currently 28 percent of organizations use Hadoop and about as many more (25%) plan to use it in the next two years. Nearly half (47%) have Hadoop-specific skills to support big data integration. For those that have limited resources, open source Hadoop can be affordable, and to automate and interface with it, adopters can use SQL in addition to its native interfaces; about three in five organizations now use each of these options. Hadoop can be a capable tool to implement big data but must be integrated with other information and operational systems.
Big data is not found only in conventional in-house information environments. Our research finds that data integration processes are most often applied between systems deployed on-premises (58%), but more than one-third (35%) are integrating cloud-based systems, which reflects the progress cloud computing has made. Nonetheless, cloud-to-cloud integration remains least common (18%). In the next year or two 20 to 25 percent of organizations plan additional support for all types of integration; those being considered most often are cloud-to-cloud (25%) and on-premises-to-cloud (23%), further reflecting movement into the cloud. In addition, nearly all (95%) organizations using cloud-to-cloud integration said they have improved their activities and processes. This finding confirms the value of integration of big data regardless of what types of systems hold it. With a growing number of organizations using cloud computing, data integration is a critical requirement for big data projects; more than one-quarter (28%) of organizations are deploying big data integration into cloud computing environments.
Because of the intense need of business units and process for big data, integration requires IT and business people to work together to build efficient processes. The largest percentage of organizations in the research (44%) have business analysts work with IT to design and deploy big data integration. Another one-third assign IT to build the integration, and half that many (16%) have IT use a dedicated data integration tool. The research finds some distrust in involving the business side. Almost one in four (23%) said they are resistant or very resistant to allowing business users to integrate big data that IT has not prepared first, and the majority (51%) resist somewhat. For more than half (58%) the IT group responsible for BI and data warehouse systems also is the key stakeholder for designing and deploying big data integration; no other option is used by more than 11 percent.
It is not surprising that IT is the department that most often facilitates big data and needs integration the most (55%). The most frequent issue arising between business units and IT is entrenchment of budgets and priorities (in 42% of organizations). Funding of big data initiatives most often comes from the general IT budget (50%); line-of-business IT budgets (38%) are the second-most commonly used. It is understandable that IT dominates this heavily technical function, but big data is beneficial only when it advances the organization’s goals for information that is needed by business. Management should ensure that IT works with the lines of business to enable them to get the information they need to improve business processes and decision-making and not settle for creating a more cost-effective and efficient method to store it.
Overcoming these challenges is a critical step in the planning process for big data. My analysis that big data won’t work well without integration is confirmed by the research. We urge organizations to take a comprehensive approach to big data and evaluate dedicated tools that can mitigate risks that others have already encountered.
CEO and Chief Research Officer
At the Informatica World 2014 conference, the company known for its data integration software unveiled the Intelligent Data Platform. In the last three years Informatica has expanded beyond data integration and now has a broad software portfolio that facilitates information management within the enterprise and through cloud computing. The Intelligent Data Platform forms a framework for its portfolio. This expression of broad potential is important for Informatica, which has been slow to position its products as capable of more than data integration. A large part of the value it provides lies in what its products can do to help organizations strengthen their enterprise architectures for managing applications and data. We see Informatica’s sweet spot in facilitating efficient use of data for business and IT purposes; we call this information optimization.
Informatica’s Intelligent Data Platform is built in three layers. The bottom layer is Informatica Vibe, the virtual data machine that I covered at its launch last year. Informatica Vibe won our Ventana Research 2013 Technology Innovation Award for information optimization. It virtualizes information management technology to operate on any platform whether on-premises or in any form of cloud computing.
Above Informatica Vibe in the platform is a data infrastructure layer, which contains all the technologies that act upon data, from integration through archiving, masking, mastering, quality assurance, security, streaming and other tasks. At the core of this second layer is Informatica PowerCenter, which provides data integration and other capabilities central to processing of data into information. PowerCenter provides parsing, profiling, joining and filtering but also is integral for data services through Informatica’s Data Integration Hub that operates in a publish-and-subscribe model. The latest PowerCenter release, version 9.6, focuses on providing agility in development and provides a series of packaged editions that provide certain levels of functionality; users choose among them to fit their requirements. This developer support includes advances in test data management and data masking for enterprise-class needs. There are editions for Informatica Data Quality, too. The latest release of Informatica MDM, 9.7, improves the user experience for data stewards along with enhanced performance and governance. Not much was mentioned at the conference about Informatica’s Product Information Management (PIM) offering that our most recent Value Index vendor and product assessment rated Hot.
The third layer is data intelligence. Here Informatica has added capabilities to organize, infer and recommend action from data and to provision and map data to business needs. In addition Informatica’s Business Glossary and Metadata Manager help establish consistent definitions and use of data for operational or analytical tasks. Informatica RulePoint, a product that also was not mentioned much at the conference, processes events through workflow in a continuous rule based manner; depending on how processing occurs, its function is to support complex event processing or event streaming.
On top of the Intelligent Data Platform, Informatica has added a couple of new innovations. Project Springbok, which is not yet released, is a tool for preparation of data for analytics and operations through its Innovation division. This new product will use Informatica’s expertise in providing access to and integration of data sources, which according to our information optimization benchmark research is the top analyst requirement in 39 percent of organizations. Despite data warehouse efforts, analysts and business users still have to access many data sources. Simplifying information is critical for nearly all organizations that have more than 16 data sources. Demonstrations showed that Springbok can dynamically create and automate the transformations that run in PowerCenter. It also offers access to a master reference to ensure that data is processed in a consistent manner. IT professionals gain visibility into what business units are doing to show how they can help in provisioning data. Even in beta release Springbok has significant potential to address the range of data issues analysts face and reduce the time they spend on data-related tasks. Our research has shown for several years that this data challenge presses organizations to diversify the tools they use, and software vendors in this market have responded. Informatica will have to compete with more than a dozen others and demonstrate its superiority for integration. Our research finds that the lines of business and IT now share responsibility for information availability in 42 percent of organizations. Informatica will have to demonstrate its value to line of business analysts who are evaluating a new generation of tools for data and analytics.
A second innovation is a new data security product called Secure@Source, also being developed in the Innovation unit, is designed to protect data assets where they are stored and processed. This product moves Informatica into the information security market segment. Secure@Source helps users discover, detect, assess and protect data assets in their persistent locations and during consumption by applications or Internet services. The question is whether Informatica can convince current customers to examine it or will have to approach information security professionals who are not users of Informatica. Security of data is among the top five required data activities according to our research and a key part of the manageability requirements that organizations find important in considering products. Informatica has an opportunity to insert itself into the dialogue in this area if it properly presents the new product to IT and business people alike.
In big data Informatica has made steady progress, but to reach its potential in this segment will require more investments in the mixed big data environments, not just Hadoop. As our research has shown for three years, customers want big data to distribute processing and integration of data across sources. Our recent research on big data analytics finds that three out of four (76%) define big data analytics as being about accessing and analyzing all sources of data. This poses a challenge for data integration, and our new research on big data integration finds that most have a long way to go in accessibility and mastering of data. Informatica begins to address this and has an opportunity in helping develop a new generation of data architecture.
In cloud computing, the company has consolidated its efforts to ensure that the cloud is part of its core technology. It released new versions for its cloud-based integration, quality, master and real-time data management products; these begin to address the challenge of process and application integration, which are important considerations for businesses in determining whether integrate or replace point cloud solutions to improve efficiency of tasks and business processes. Informatica has continued to focus on integrating mostly with the large cloud computing providers and has yet to invest in streamlining processes in particular lines of business. This has left openings for other cloud integration providers to compete, making it harder than expected for Informatica to dominate in this segment. The next step here is up to Informatica.
I believe that one of the highest potential opportunities for Informatica is in the application architectures of organizations whose business processes have been distributed through a collection of cloud-based applications that lack interconnectivity and integration. For example, finance departments often have software from different providers for budgeting and planning, consolidation and reporting, accounting and payroll management. When these applications are spread across the cloud, connecting them is a real challenge, let alone trying to get information from sales force automation and customer service applications. The implications of this are shown in our finance analytics research : Data-related tasks consume the most time and impede the efficiency of financial processes as they do in all other line of business areas that we have researched. Similar situations exist in customer-related areas (marketing, sales and customer service) and employee management processes (recruiting, onboarding, performance, compensation and learning). Informatica has made progress with Informatica Cloud Extend for interconnecting tasks across applications, which can help streamline processes. While perhaps not obvious to data integration specialists, this level of process automation and integration is essential to the future of cloud computing. Informatica also announced it will offer master data management in the cloud; this should help it not just to place a data hub in the cloud but to help companies interoperate separate cloud applications more efficiently.
Overall the Informatica Intelligent Data Platform is a good reference model for tasks related to turning data into information assets. But it could be much distinct in how its automation accelerates the processing of data faster and helps specific roles work faster and smarter. This platform does not provide a context for enterprise architectures that are stretched between on-premises and various cloud deployments. Organizations will have to determine whether Informatica’s approach fits their future data and architectural needs. As Informatica pushes its platform approach, it has to ensure it is seen as a leader in big data integration, helping business analysts with data, supporting a larger number of application sources and connecting cloud computing through unifying business applications. This won’t be easy to accomplish as Informatica has not been as progressive in the broader approach to big data and use across operations and analytics.
Informatica has been growing substantially and is getting close to US$1 billion in annual software revenue. We have recognized its success through rating it a Hot vendor in our Data Integration Value Index and naming one of its customers, the CIO of UMass Memorial Health Care, the Chief Information Officer in our 2013 Leadership Awards. Informatica has been continuing substantial investment in R&D. Its acquisitions of data-related software companies have helped it grow, and Informatica has invested to integrate the products with PowerCenter. With almost half (49%) of organizations planning to change their information availability processes, the opportunity for Informatica is significant; its challenge is to gain the confidence and recognition by business customers, who now play a larger role in the selection and purchasing of software. This will require Informatica to speak their language of business and not just technology but the business processes that they are held accountable. Informatica is a major player in information management; now it must become as significant a choice for streamlining business processes and use of applications and data across the enterprise and cloud computing to enable information optimization.
CEO & Chief Research Officer