Big data has become a big deal as the technology industry has invested tens of billions of dollars to create the next generation of databases and data processing. After the accompanying flood of new categories and marketing terminology from vendors, most in the IT community are now beginning to understand the potential of big data. Ventana Research thoroughly covered the evolving state of the big data and information optimization sector in 2014 and will continue this research in 2015 and beyond. As it progresses, the importance of making big data systems interoperate with existing enterprise and information architecture, along with digital transformation strategies, becomes critical. Done properly, companies can take advantage of big data innovations to optimize their established business processes and execute new business strategies. But deploying big data and applying analytics to understand it is just the beginning. Innovative organizations must go beyond the usual exploratory and root-cause analyses through applied analytic discovery and other techniques. This of course requires them to develop competencies in information management for big data.
Among big data technologies, the open source Hadoop has been commercialized by now-established providers including Cloudera, Hortonworks and MapR and made available in the cloud through platforms such as Qubole, which received a Ventana Research Technology Innovation Award in 2014. Other big data technologies are growing as well; for example, like Hadoop, use of in-memory and specialized databases is growing in more than 40 percent of organizations, according to our big data integration benchmark research. These technologies have been integrated into databases or what I call hybrid big data appliances like those from IBM, Oracle, SAP and Teradata that bring the power of Hadoop to the RDBMS and exploit in-memory processing to perform ever faster computing. When placed into hosted and cloud environments these appliances can virtualize big data processing. Another new provider, Splice Machine, brings the power of SQL processing in a scalable, cloud-based approach that uses Hadoop; it received a Ventana Research Technology Leadership Award last year. Likewise advances in NoSQL approaches help organizations process and utilize semistructured information along with other information and blend them with analytics as Datawatch does. These examples show that disruptive technologies still have the potential to revolutionize our approaches to managing information.
Our firm also explores what we call information optimization, which assesses techniques for gaining full value from business information. Big data is one of these when used effectively in an enterprise information architecture. In this context the “data lake” analogy is not helpful in representing the full scope of big data, suggesting simply a container like a data mart or data warehouse. With big data, taking an architectural approach is critical. This viewpoint is evident in our 2014 Ventana Research Technology Innovation Award in Information Management to Teradata for its Unified Data Architecture. Another award winner, Software AG, blends big data and information optimization using its real-time and in-memory processing technologies.
Businesses need to process data in rapid cycles, many in real time. This is the domain of what we call operational intelligence, which utilizes events and streams and provides the ability to sense and respond immediately to issues and opportunities in organizations that adapt to a data-driven culture. Our operational intelligence research finds that monitoring, alerting and notification are the top use cases for deployment, in more than half of organizations. Also machine data can help businesses optimize not just IT processes but business processes that help govern and control the security of data in the enterprise. This imperative is evident in the dramatic growth of suppliers such as Splunk, Sumo Logic and Savi Technology, all of which won Ventana Research Technology Innovation awards for how they process machine and business data in large volumes at rapid velocity.
Another increasing trend in big data is presenting it in ways that ordinary users can understand quickly. Discovery and advanced visualization are not enough for business users who are not trained to interpret these presentations. Some vendors can present location and geospatial data on maps that are easier to understand. At the other end of the user spectrum data scientists and analysts need more robust analytic and discovery tools, including predictive analytics, which is a priority for many organizations, according to our big data analytics research. In 2015 we will examine the next generation of predictive analytics in new benchmark research. But there is more work to do to present insights from information that are easy to understand. Some analytics vendors are telling stories by linking pages of content, but these narratives don’t as yet help individuals assess and act. Most analytics tools can’t match the simple functionality of Microsoft PowerPoint, placing descriptive titles, bullets and recommendations on a page with a graphic that represents something important to the business professional who reads it. Deeper insights may come from advances in machine learning and cognitive computing that have arrived on the market and bring more science to analytics.
So we see strong potential for the outputs of big data, but they don’t arrive just by loading data into these new computing environments. Pragmatic and experienced professionals realize that information management processes do not disappear. A key one in this area is data preparation, which helps ready data sets for processing into big data environments. Preparing data is the second-most important task for 46 percent of organizations in our big data integration research. A second is data integration, which some new tools can automate. This can enable lines of business and IT to work together on big data integration, as 41 percent of organizations in our research are planning to do. To address this need a new generation of technologies came into their own in 2014 including those that received Ventana Research Technology Innovation Awards like Paxata and Tamr but also Trifacta.
Yet another area to watch is the convergence of big data and cloud computing. The proliferation of data sources in the cloud forces organizations to manage and integrate data from a variety of cloud and Internet sources, hence the rise of information as a service for business needs. Ventana Research Technology Innovation Award winner DataSift provides information as a service to blend social media data with other big data and analytics. Such techniques require more flexible environments for integration that can operate anywhere at any time. Dell Boomi, MuleSoft, SnapLogic and others now challenge established data integration providers such as Informatica and others including IBM, Oracle and SAP. Advances in master data management, data governance, data quality and integration backbones, along with providers such as Informatica and Information Builders, help provide better consistency of any type of big data for any business purpose. In addition our research finds that data security is critical for big data in 61 percent of organizations; only 14 percent said that it is very adequate in their organization.
There is no doubt that big data is now widespread; almost 80 percent of organizations in our information optimization research, for example, will be using it in some form by the end of 2015. This is partly due to increased use across the lines of business; our research on next-generation customer analytics in 2014 shows that it is important to improving understanding of customers in 60 percent of organizations, is being used in one-fifth of organizations and will be in 46 percent by the end of this year. Similarly our next-generation finance analytics research in 2014 finds big data important to 37 percent of organizations, with 13 percent using it today and 42 percent planning to by the end of 2015. And we have already measured how it will impact human capital management and HR and where organizations are leveraging it in this area of importance.
I invite you to download and peruse our big data agenda for 2015. We will examine how organizations can instrument information optimization processes that use big data and pass this guidance along. We will explore big data’s role in sales and product areas and produce new research on data and analytics in the cloud. Our research will uncover best practices that innovative organizations use not only to prepare and integrate big data but also more tightly unify it with analytics and operations across enterprise and cloud computing environments. For many organizations taking on this challenge and seeking its benefits will require new information platforms and methods to access and provide information as part of their big data deployments. (Getting consistent information across the enterprise is the top benefit of big data integration according to 39 percent of organizations.) We expect 2015 to be a big year for big data and information optimization. I look forward to providing more insights and information about big data and helping everyone get the most from their time and investments in it.
CEO and Chief Research Officer
The market for big data continues to grow as organizations try to extract business value from their own masses of data and other sources. Earlier this year I outlined the dynamics of the business opportunity for big data and information optimization. We continue to see advances as big data and associated information technologies deliver more value, but the range of innovation also has created fragmentation among existing systems including databases that are managed on-premises or in cloud computing environments. In this changing environment organizations encounter new challenges not only in adapting to technology that is more efficient in automating data processing but also in integrating it into their enterprise architecture. I’ve already explained how big data can be ineffective without integration, and we conducted more in-depth research into the market, resulting in our benchmark research on big data integration, which reveals the state of how organizations are adopting this technology in their processes.
The research shows that use of big data techniques has become widespread: Almost half (48%) of all organizations participating in this research and two-thirds of the very large ones use it for storage, and 45 percent intend to use big data in the next year or sometime in the future. This is a significant change in that most organizations have used relational database management systems (RDBMSs) for nearly everything. We find that RDBMSs (76%) are still the most widely used big data technology, followed by flat files (61%) and data warehouse appliances (46%). But this is not the direction many companies are planning to take in the future: Hadoop (44%), in-memory databases (46%), specialized databases (43%) and NoSQL (42%) are the tools most often planned to be used by 2016 or being evaluated. Clearly there is a revolution in approaches to storing and using data, and that introduces both opportunities and challenges.
Establishing a big data environment requires integrating data through proper preparation and potentially continuous updates of data, whether in real time or batch processing. A further complication is that many organizations will not have only one but several big data environments to be integrated into the overall enterprise architecture; that requires data and systems integration. Our research finds that some organizations are aware of this issue: Automating big data integration is very important to 45 percent and important to more than one-third. Automation can not only bring efficiency to big data but also remove many risks of errors or inaccurate and inconsistent data.
Data integration technologies have evolved over the past decade, but advances to support big data are more recent. Our research shows a disparity in how well organizations handle big data integration tasks. Those that are mostly or completely adequate are accessing (for 63%), loading (60%), extracting (59%), archiving (55%) and copying (52%) data while the areas most in need of improvement are virtualizing (39%), profiling (37%), blending (34%), master data management (33%) and masking for privacy (33%). At the system level, the research finds that conventional enterprise capabilities are most often needed: load balancing (cited by 51%), cross-platform support (47%), a development and testing environment (42%), systems management (40%) and scalable execution of tasks (39%). To test the range of big data integration capabilities before it is applied to production projects, the “sandbox” has become the standard approach. For their development and testing environment, the largest percentage (36%) said they will use an internal sandbox with specialized big data. This group of findings reveals that big data integration has enterprise-level requirements that go beyond just loading data to build on advances in data integration.
Big data must not be a separate store of data but part of the overall enterprise and data architecture; that is necessary to ensure full integration and use of the data. Organizations that see data integration as critical to big data are embarking on sophisticated efforts to achieve it. The data integration capabilities most critical to their big data efforts are to develop and manage metadata that can be shared across BI systems (cited by 58%), to join disparate data sources during transformation (56%) and to establish rules for processing and routing data (56%).
Other organizations are still examining how to automate integration tasks. The most common barriers to improving big data integration are cost of the software or license (for 44%), lack of resources to use on improvement (37%) and the sense that big data technologies are too complicated to integrate (35%). These findings demonstrate that many organizations need to better understand the efficiency and cost savings that can be realized by using purpose-built technology instead of manual approaches using tools not designed for big data. Along with identifying solid business benefits, establishing savings of time and money is an essential piece of a convincing rationale for investment in big data integration technology. The most time spent in big data integration today is on basic tasks: reviewing data for quality and consistency (52%), preparing data for integration (46%) and connecting to data sources for integration (39%). The first two are related to ensuring that data is ready to load into big data environments. Data preparation is a key part of big data and overall information optimization. More vendors are developing dedicated technology to help with it.
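To make the first two of those tasks concrete, here is a minimal sketch in Python of what reviewing data for quality and preparing it for integration can look like at the smallest scale. It is an illustration only, not a description of any vendor's product or of the research methodology; the function names, field names and sample records are all hypothetical.

```python
from collections import Counter


def profile_records(records, required_fields):
    """Count, per required field, how many records lack a usable value.

    A value counts as missing when the field is absent, None, or blank.
    (Hypothetical helper for illustration.)
    """
    missing = Counter()
    for record in records:
        for field in required_fields:
            value = record.get(field)
            if value is None or str(value).strip() == "":
                missing[field] += 1
    return dict(missing)


def normalize_record(record):
    """Trim and lowercase string values so records from different
    sources can be compared consistently. Non-string values pass through.
    (Hypothetical helper for illustration.)
    """
    return {
        key: value.strip().lower() if isinstance(value, str) else value
        for key, value in record.items()
    }


# Hypothetical sample data: one clean record, one blank email, one missing email.
records = [
    {"id": 1, "email": " A@Example.com "},
    {"id": 2, "email": ""},
    {"id": 3},
]

quality_report = profile_records(records, ["id", "email"])
prepared = [normalize_record(r) for r in records]
```

Dedicated data preparation tools apply checks of this kind automatically and at far larger scale, which is where the time savings come from.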
For a process as complex as big data integration, choosing the right technology tool can be difficult. More than half (55%) of organizations are planning to change the way they assess and select such technology. Evaluations of big data integration tools should include considerations of how to deploy it and what sort of vendors can provide it. Almost half (46%) of organizations prefer to integrate big data on-premises while 28 percent opt for cloud-based software as a service and 17 percent have no preference. Half of organizations plan to use cloud computing for managing big data; another one-third (32%) don’t know whether they will. The research shows that the most important technology and vendor criteria used to evaluate big data integration technology are usability (very important for 53%), reliability (52%) and functionality (49%). These top three evaluation criteria are followed by manageability, TCO/ROI, adaptability and validation of vendors. Organizations are most concerned to have technology that is easy to use and can scale to meet their needs.
Big data cannot be used effectively without integration; we observe that the big data industry has not paid as much attention to information management as it should – after all, this is what enables automating the flow of data. Organizations trying to use big data without a focus on information management will have difficulty in optimizing the use of their data assets for business needs. Our research into big data integration finds that the proper technology is critical to meet these needs. We also learned from our benchmark research into big data analytics that data preparation is the largest and most time-consuming set of tasks that needs to be streamlined for best use of the analytics that reveal actionable insights. Organizations that are initiating or expanding their big data deployments, whether on-premises or within cloud computing environments, should have integration at the top of their priority list to ensure they do not create silos of data that they can’t fully exploit.
CEO and Chief Research Officer