You are currently browsing the tag archive for the ‘Big Data’ tag.
Data is an essential ingredient for every aspect of business, and those that use it well are likely to gain advantages over competitors that do not. Our benchmark research on information optimization reveals a variety of drivers for deploying information, most commonly analytics, information access, decision-making, process improvements and customer experience and satisfaction. To accomplish any of these purposes requires that data be prepared through a sequence of steps: accessing, searching, aggregating, enriching, transforming and cleaning data from different sources to create a single uniform data set. To prepare data properly, businesses need flexible tools that enable them to enrich the context of data drawn from multiple sources, collaborate on its preparation to serve business needs and govern the process of preparation to ensure security and consistency. Users of these tools range from analysts to operations professionals in the lines of business.
Data preparation efforts often encounter challenges created by the use of tools not designed for these tasks. Many of today’s analytics and business intelligence products do not provide enough flexibility, and data management tools for data integration are too complicated for analysts who need to interact ad hoc with data. Depending on IT staff to fill ad hoc requests takes far too long for the rapid pace of today’s business. Even worse, many organizations use spreadsheets because they are familiar and easy to work with. However, when it comes to data preparation, spreadsheets are awkward and time-consuming and require expertise to code them to perform these tasks. They also incur risks of errors in data and inconsistencies among disparate versions stored on individual desktops.
In effect inadequate tools waste analysts’ time, which is a scarce resource in many organizations, and can squander market opportunities through delays in preparation and unreliable data quality. Our information optimization research shows that most analysts spend the majority of their time not in actual analysis but in readying the data for analysis. More than 45 percent of their time goes to preparing data for analysis or reviewing the quality and consistency of data.
Businesses need technology tools capable of handling data preparation tasks quickly and dependably so users can be sure of data quality and concentrate on the value-adding aspects of their jobs. More than a dozen such tools designed for these tasks are on the market. The best among them are easy for analysts to use, which our research shows is critical: More than half (58%) of participants said that usability is a very important evaluation criterion, more than any other, in software for optimizing information. These tools also deal with the large numbers and types of sources organizations have accumulated: 92 percent of those in our research have 16 to 20 data sources, and 80 percent have more than 20 sources. Complicating the issue further, these sources are not all inside the enterprise; they also are found on the Internet and in cloud-based environments where data may be in applications or in big data stores.
Organizations can’t make business use of their data until it is ready, so simplifying and enhancing the data preparation process can make it possible for analysts to begin analysis sooner and thus be more productive. Our analysis of time related to data preparation finds that when this is done right, significant amounts of time could be shifted to tasks that contribute to achieving business goals. We conclude that, assuming analysts spend 20 hours a week working on analytics, most are spending six hours on preparing data, another six hours on reviewing data for quality and consistency issues, three more hours on assembling information, another two hours waiting for data from IT and one hour presenting information for review; this leaves only two hours for performing the analysis itself.
Dedicated data preparation tools provide support for key tasks in areas that our research and experience finds that are done manually by about one-third of organizations. These data tasks include search, aggregation, reduction, lineage tracking, metrics definition and collaboration. If an organization is able to reduce the 14 hours previously mentioned in data-related tasks (that including preparing data, reviewing data and waiting for data from IT) by one-third, it will have an extra four hours a week for analysis – that’s 10 percent of a 40-hour work week. Multiply this time by the number of individual analysts and it becomes significant. Using the proper tools can enable such a reallocation of time to use the professional expertise of these employees.
This savings can apply in any line of business. For example, our research into next-generation finance analytics shows that more than two-thirds (68%) of finance organizations spend most of their analytics time on data-related tasks. Further analysis shows that only 36 percent of finance organizations that spend the most time on data-related tasks can produce metrics within a week, compared to more than half (56%) of those that spend more time on analytic tasks. This difference is important to finance organizations seeking to take a more active role in corporate decision-making.
Another example is found in big data. The flood of business data has created even more challenges as the types of sources have expanded beyond just the RDBMS and data appliances; Hadoop, in-memory and NoSQL big data sources exist in at least 25 percent of organizations, according to our big data integration research. Our projections of growth based on what companies are planning indicates that Hadoop, in-memory and NoSQL sources will increase significantly. Each of these types must draw from systems from various providers, which have specific interfaces to access data let alone load it. Our research in big data finds similar results regarding data preparation: The tasks that consume the most time are reviewing data for quality and consistency (52%) and preparing data (46%). Without automating data preparation for accessing and streamlining the loading of data, big data can be an insurmountable task for companies seeking efficiency in their deployments.
A third example is in the critical area of customer analytics. Customer data is used across many departments but especially marketing, sales and customer service. Our research again finds similar issues regarding time lost to data preparation tasks. In our next-generation customer analytics benchmark research preparing data is the most time-consuming task (in 47% of organizations), followed closely by reviewing data (43%). The research also finds that data not being readily available is the most common point of dissatisfaction with customer analytics (in 63% of organizations). Our research finds other examples, too, in human resources, sales, manufacturing and the supply chain.
The good news is that these business-focused data preparation tools have usability in the form of spreadsheet-like interfaces and include analytic workflows that simplify and enhance data preparation. In searching for and profiling of data and examining fields based on analytics, use of color can help highlight patterns in the data. Capabilities for addressing duplicate and incorrect data about, for example, companies, addresses, products and locations are built in for simplicity of access and use. In addition data preparation is entering a new stage in which machine learning and pattern recognition, along with predictive analytics techniques, can help guide individuals to issues and focus their efforts on looking forward. Tools also are advancing in collaboration, helping teams of analysts work together to save time and take advantage of colleagues’ expertise and knowledge of the data, along with interfacing to IT and data management professionals. In our information optimization research collaboration is a critical technology innovation, according to more than half (51%) of organizations. They desire several collaborative capabilities ranging from discussion forms to knowledge sharing to requests on activity streams.
This data preparation technology provides support for ad hoc and other agile approaches to working with data that maps to how business actually operate. Taking a dedicated approach can help simplify and speed data preparation and add value by enabling users to perform analysis sooner and allocate more time to it. If you have not taken a look at how data preparation can improve analytics and operational processes, I recommend that you start now. Organizations are saving time and becoming more effective by focusing more on business value-adding tasks.
CEO and Chief Research Officer
Big data has become a big deal as the technology industry has invested tens of billions of dollars to create the next generation of databases and data processing. After the accompanying flood of new categories and marketing terminology from vendors, most in the IT community are now beginning to understand the potential of big data. Ventana Research thoroughly covered the evolving state of the big data and information optimization sector in 2014 and will continue this research in 2015 and beyond. As it progresses the importance of making big data systems interoperate with existing enterprise and information architecture along with digital transformation strategiesbecomes critical. Done properly companies can take advantage of big data innovations to optimize their established business processes and execute new business strategies. But just deploying big data and applying analytics to understand it is just the beginning. Innovative organizations must go beyond the usual exploratory and root-cause analyses through applied analytic discovery and other techniques. This of course requires them to develop competencies in information management for big data.
Among big data technologies, the open source Hadoop has been commercialized by now established providers including Cloudera, Hortonworks and MapR and made available in the cloud through platforms such as Qubole, which received a Ventana Research Technology Innovation Award in 2014. Other big data technologies are growing as well; for example, use of in-memory and specialized databases also is growing like Hadoop in more than 40 percent of organizations, according to our big data integration benchmark research. These technologies have been integrated into databases or what I call hybrid big data appliances like those from IBM, Oracle, SAP and Teradata that bring the power of Hadoop to the RDBMS and exploit in-memory processing to perform ever faster computing. When placed into hosted and cloud environments these appliances can virtualize big data processing. Another new provider, Splice Machine, brings the power of SQL processing in a scalable approach that uses Hadoop in a cloud-based approach; it received a Ventana Research Technology Leadership Award last year. Likewise advances in NoSQL approaches help organizations process and utilize semistructured information along with other information and blend them with analytics as Datawatch does. These examples show that disruptive technologies still have the potential to revolutionize our approaches to managing information.
Our firm also explores what we call information optimization, which assesses techniques for gaining full value from business information. Big data is one of these when used effectively in an enterprise information architecture. In this context the “data lake” analogy is not helpful in representing the full scope of big data, suggesting simply a container like a data marts or data warehouse. With big data, taking an architectural approach is critical. This viewpoint is evident in our 2014 Ventana Research Technology Innovation Award in Information Management to Teradata for its Unified Data Architecture. Another award winner, Software AG, blends big data and information optimization using its real-time and in-memory processing technologies.
Businesses need to process data in rapid cycles, many in real time and what we call operational intelligence, which utilizes events and streams and provides the ability to sense and respond immediately to issues and opportunities in organizations that adapt to a data-driven culture. Our operational intelligence research finds that monitoring, alerting and notification are the top use cases for deployment, in more than half of organizations. Also machine data can help businesses optimize not just IT processes but business processes that help govern and control the security of data in the enterprise. This imperative is evident in the dramatic growth of suppliers such as Splunk, Sumo Logic and Savi Technology, all of which won Ventana Research Technology Innovation awards for how they process machine and business data in large volumes at rapid velocity.
Another increasing trend in big data is presenting it in ways that ordinary users can understand quickly. Discovery and advanced visualization is not enough for business users who are not trained to interpret these presentations. Some vendors can present locationand geospatial data on maps that are easier to understand. At the other end of the user spectrum data scientists and analysts need more robust analytic and discovery tools, including predictive analytics, which is a priority for many organizations, according toour big data analytics research. In 2015 we will examine the next generation of predictive analytics in new benchmark research. But there is more work to do to present insights from information that are easy to understand. Some analytics vendors are telling stories by linking pages of content, but these narratives don’t as yet help individuals assess and act. Most analytics tools can’t match the simple functionality of Microsoft PowerPoint, placing descriptive titles, bullets and recommendations on a page with a graphic that represents something important to these business professional who reads it. Deeper insights may come from advances in machine learning and cognitive computing that have arrived on the market and bring more science to analytics.
So we strong potential for the outputs of big data, but they don’t arrive just by loading data into these new computing environments. Pragmatic and experienced professionals realize that information management processes do not disappear. A key one in this area is data preparation, which helps ready data sets for processing into big data environments. Preparing data is the second-most important task for 46 percent of organizations in our big data integration research. A second is data integration, which some new tools can automate. This can enable lines of business and IT to work together on big data integration, as 41 percent of organizations in our research are planning to do. To address this need a new generation of technologies came into their own in 2014 including those that received Ventana Research Technology Innovation Awards like Paxata and Tamr but also Trifacta.
Yet another area to watch is the convergence of big data and cloud computing. The proliferation of data sources in the cloud forces organizations to managed and integrate data from a variety of cloud and Internet sources, hence the rise of information as a service for business needs. Ventana Research Technology Innovation Award winner DataSift provides information as a service to blend social media data with other big data and analytics. Such techniques require more flexible environments for integration that can operate anywhere at any time. Dell Boomi, MuleSoft, SnapLogic and others now challenge established data integration providers such as Informatica and others including IBM, Oracle and SAP. Advances in master data management, data governance, data quality and integration backbones, and Informatica and Information Builders help provide better consistency of any type of big data for any business purpose. In addition our research finds that data security is critical for big data in 61 percent of organizations; only 14 percent said that is very adequate in their organization.
There is no doubt that big data is now widespread; almost 80 percent of organizations in our information optimization research, for example, will be using it some form by the end of 2015. This is partly due to increased use across the lines of business; our research on next-generation customer analytics in 2014 shows that it is important to improving understanding customers in 60 percent of organizations, is being used in one-fifth of organizations and will be in 46 percent by the end of this year. Similarly our next-generation finance analytics research in 2014 finds big data important to 37 percent of organizations, with 13 percent using it today and 42 percent planning to by the end of 2015. And we have already measured how it will impact human capital management and HR and where organizations are leveraging it in this area of importance.
I invite you to download and peruse our big data agenda for 2015. We will examine how organizations can instrument information optimization processes that use big data and pass this guidance along. We will explore big data’s role in sales and product areas and produce new research on data and analytics in the cloud. Our research will uncover best practices that innovative organizations use not only to prepare and integrate big data but also more tightly unify it with analytics and operations across enterprise and cloud computing environments. For many organizations taking on this challenge and seeking its benefits will require new information platforms and methods to access and provide information as part of their big data deployments. (Getting consistent information across the enterprise is the top benefit of big data integration according to 39 percent of organizations.) We expect 2015 to be a big year for big data and information optimization. I look forward to providing more insights and information about big data and helping everyone get the most from their time and investments in it.
CEO and Chief Research Officer