Big data has become a big deal as the technology industry has invested tens of billions of dollars to create the next generation of databases and data processing. After the accompanying flood of new categories and marketing terminology from vendors, most in the IT community are now beginning to understand the potential of big data. Ventana Research thoroughly covered the evolving state of the big data and information optimization sector in 2014 and will continue this research in 2015 and beyond. As it progresses, the importance of making big data systems interoperate with existing enterprise and information architecture, along with digital transformation strategies, becomes critical. Done properly, this lets companies take advantage of big data innovations to optimize their established business processes and execute new business strategies. But deploying big data and applying analytics to understand it is just the beginning. Innovative organizations must go beyond the usual exploratory and root-cause analyses through applied analytic discovery and other techniques. This of course requires them to develop competencies in information management for big data.
Among big data technologies, the open source Hadoop has been commercialized by now-established providers including Cloudera, Hortonworks and MapR and made available in the cloud through platforms such as Qubole, which received a Ventana Research Technology Innovation Award in 2014. Other big data technologies are growing as well; for example, like Hadoop, use of in-memory and specialized databases is growing in more than 40 percent of organizations, according to our big data integration benchmark research. These technologies have been integrated into databases or what I call hybrid big data appliances like those from IBM, Oracle, SAP and Teradata that bring the power of Hadoop to the RDBMS and exploit in-memory processing to perform ever faster computing. When placed into hosted and cloud environments these appliances can virtualize big data processing. Another new provider, Splice Machine, brings the power of SQL processing in a scalable, cloud-based approach that uses Hadoop; it received a Ventana Research Technology Leadership Award last year. Likewise, advances in NoSQL approaches help organizations process and utilize semistructured information along with other information and blend them with analytics, as Datawatch does. These examples show that disruptive technologies still have the potential to revolutionize our approaches to managing information.
Our firm also explores what we call information optimization, which assesses techniques for gaining full value from business information. Big data is one of these when used effectively in an enterprise information architecture. In this context the “data lake” analogy is not helpful in representing the full scope of big data, as it suggests simply a container like a data mart or data warehouse. With big data, taking an architectural approach is critical. This viewpoint is evident in our 2014 Ventana Research Technology Innovation Award in Information Management to Teradata for its Unified Data Architecture. Another award winner, Software AG, blends big data and information optimization using its real-time and in-memory processing technologies.
Businesses need to process data in rapid cycles, many in real time, through what we call operational intelligence, which utilizes events and streams and provides the ability to sense and respond immediately to issues and opportunities in organizations that adapt to a data-driven culture. Our operational intelligence research finds that monitoring, alerting and notification are the top use cases for deployment, in more than half of organizations. Machine data can also help businesses optimize not just IT processes but business processes that help govern and control the security of data in the enterprise. This imperative is evident in the dramatic growth of suppliers such as Splunk, Sumo Logic and Savi Technology, all of which won Ventana Research Technology Innovation Awards for how they process machine and business data in large volumes at rapid velocity.
Another increasing trend in big data is presenting it in ways that ordinary users can understand quickly. Discovery and advanced visualization are not enough for business users who are not trained to interpret these presentations. Some vendors can present location and geospatial data on maps that are easier to understand. At the other end of the user spectrum, data scientists and analysts need more robust analytic and discovery tools, including predictive analytics, which is a priority for many organizations, according to our big data analytics research. In 2015 we will examine the next generation of predictive analytics in new benchmark research. But there is more work to do to present insights from information in ways that are easy to understand. Some analytics vendors are telling stories by linking pages of content, but these narratives don’t as yet help individuals assess and act. Most analytics tools can’t match the simple functionality of Microsoft PowerPoint, placing descriptive titles, bullets and recommendations on a page with a graphic that represents something important to the business professional who reads it. Deeper insights may come from advances in machine learning and cognitive computing that have arrived on the market and bring more science to analytics.
So we see strong potential for the outputs of big data, but they don’t arrive just by loading data into these new computing environments. Pragmatic and experienced professionals realize that information management processes do not disappear. A key one in this area is data preparation, which helps ready data sets for processing in big data environments. Preparing data is the second-most important task for 46 percent of organizations in our big data integration research. A second is data integration, which some new tools can automate. This can enable lines of business and IT to work together on big data integration, as 41 percent of organizations in our research are planning to do. To address this need a new generation of technologies came into their own in 2014, including Ventana Research Technology Innovation Award recipients Paxata and Tamr as well as Trifacta.
Yet another area to watch is the convergence of big data and cloud computing. The proliferation of data sources in the cloud forces organizations to manage and integrate data from a variety of cloud and Internet sources, hence the rise of information as a service for business needs. Ventana Research Technology Innovation Award winner DataSift provides information as a service to blend social media data with other big data and analytics. Such techniques require more flexible environments for integration that can operate anywhere at any time. Dell Boomi, MuleSoft, SnapLogic and others now challenge established data integration providers such as Informatica and others including IBM, Oracle and SAP. Advances in master data management, data governance, data quality and integration backbones, along with providers Informatica and Information Builders, help provide better consistency of any type of big data for any business purpose. In addition our research finds that data security is critical for big data in 61 percent of organizations; only 14 percent said that it is very adequate in their organization.
There is no doubt that big data is now widespread; almost 80 percent of organizations in our information optimization research, for example, will be using it in some form by the end of 2015. This is partly due to increased use across the lines of business; our research on next-generation customer analytics in 2014 shows that it is important to improving understanding of customers in 60 percent of organizations, is being used in one-fifth of organizations and will be in 46 percent by the end of this year. Similarly our next-generation finance analytics research in 2014 finds big data important to 37 percent of organizations, with 13 percent using it today and 42 percent planning to by the end of 2015. And we have already measured how it will impact human capital management and HR and where organizations are leveraging it in this area of importance.
I invite you to download and peruse our big data agenda for 2015. We will examine how organizations can instrument information optimization processes that use big data and pass this guidance along. We will explore big data’s role in sales and product areas and produce new research on data and analytics in the cloud. Our research will uncover best practices that innovative organizations use not only to prepare and integrate big data but also more tightly unify it with analytics and operations across enterprise and cloud computing environments. For many organizations taking on this challenge and seeking its benefits will require new information platforms and methods to access and provide information as part of their big data deployments. (Getting consistent information across the enterprise is the top benefit of big data integration according to 39 percent of organizations.) We expect 2015 to be a big year for big data and information optimization. I look forward to providing more insights and information about big data and helping everyone get the most from their time and investments in it.
CEO and Chief Research Officer
Many businesses are close to being overwhelmed by the unceasing growth of data they must process and analyze to find insights that can improve their operations and results. To manage this big data they find a rapidly expanding portfolio of technology products. A significant vendor in this market is SAS Institute. I recently attended the company’s annual analyst summit, Inside Intelligence 2014 (Twitter Hashtag #SASSB). SAS reported more than $3 billion in software revenue for 2013 and is known globally for its analytics software. Recently it has become a more significant presence in data management as well. SAS provides applications for various lines of business and industries in areas as diverse as fraud prevention, security, customer service and marketing. To accomplish this it applies analytics to what is now called big data, but the company has many decades of experience in dealing with large volumes of data. Recently SAS set a goal to be the vendor of choice for the analytic, data and visualization software needs for Hadoop. To achieve this aggressive goal the company will have to make significant further investments in not only its products but also marketing and sales. Our benchmark research on big data analytics shows that three out of four (76%) organizations view big data analytics as analyzing data from all sources, not just one, which sets the bar high for vendors seeking to win their business.
In the last few years SAS has been investing heavily to expand its portfolio in big data. Today its in-memory infrastructure can operate within Hadoop, execute MapReduce jobs, access the various commercial distributions of Hadoop, conduct data preparation and modeling in Hadoop and extend it to its data and visual discovery and exploration tools. SAS has architected its analytics tools and platform to use Hadoop’s Pig and Hive interfaces, apply MapReduce to process large data sets and use Hadoop Distributed File System (HDFS) to store and access the big data. To exploit Hadoop more deeply, the SAS LASR Analytic Server (part of SAS Visual Analytics) connects directly to HDFS to speed performance. SAS LASR Analytic Server is an in-memory computing platform for data processing and analysis that can scale up and operate in parallel within Hadoop to distribute the computation and data workloads. This flexibility in the architecture enables users to adapt SAS to any type of big data, especially Hadoop deployments, for just about any scale and configuration. To work with other database-oriented technologies the company has built technical partnerships not only with major players Teradata and SAP but also with the new breed of Hadoop vendors Cloudera, Hortonworks and Pivotal, as well as with IBM BigInsights. SAS also engineered access to SAP HANA, which establishes further integration into SAP’s data platform for analytics and other applications.
At the Inside Intelligence gathering, SAS demonstrated its new Visual Statistics product. Like Visual Analytics, this one is available online for evaluation. It offers sophisticated support for analysts and data professionals who need more than just a visually interactive analytic tool of the sort that many providers now sell. Developing a product like Visual Statistics is a smart move according to our research, which finds that predictive analytics and statistics is the most important area of big data analytics, cited by 78 percent of organizations. At this point visual and data discovery are most common, but we see that users are looking for more. SAS Visual Statistics can conduct in-memory statistical processing and compute results inside Hadoop before the data is transferred to another analytic data repository or read directly into an analytics tool. A demonstration at the analyst summit revealed how these capabilities, along with the use of tools in SAS 9.4, could raise the bar for sophisticated analytics tools for business.
SAS also has a data management software suite for data integration, quality, mastering and governance and is working to make the product known for its big data support. This is another important area: Our research in big data analytics finds quality and consistency of data to be significant challenges for 56 percent of organizations and also that 47 percent are not satisfied with integration of information for creating big data analytics. SAS is expanding to provide data quality tools for Hadoop. Its portfolio is expansive in this area, but it should take steps to market these capabilities better, which spokespeople said it will do in 2014. Our recent research in information optimization found that organizations still are spending disproportionate amounts of time in preparing data (47%) and reviewing it (45%) for analytics. They need to address these difficulties to free their analysts to spend more time on analysis that produces recommendations for decision-makers and to collaborate on business improvement. SAS’s efforts to integrate data and analytics should help reduce the time spent on preparation and help analysts focus on what matters.
SAS also will expand its SAS Stream Processing Engine with a new release coming by midyear. This product can process data as it is being generated, which facilitates real-time analytics, the third-most important type of big data analytics according to our research. Applying analytics in real time is the most important aspect of in-memory computing for two-thirds (65%) of organizations and is critical as SAS expands its SAS LASR Analytic Server. Our benchmark research on operational intelligence shows that the processing of event data is critical for areas like activity or event monitoring (said 62% of participants) and alerting and notification (59%). SAS will need to expand its portfolio in these areas, but it is delivering on what I call the four types of discovery for big data.
SAS also is moving deeper into cloud computing with support for both private and public clouds through investments in its own data centers. Cloud computing is an increasingly popular approach to building a sandbox environment for big data analytics. Our research finds that more than one-fifth of organizations prefer to use cloud computing in an on-demand approach. SAS will have to provide even more of its portfolio using big data in the cloud or risk customers turning to Amazon and others for processing and potentially other computing uses. SAS asserts it is investing and expanding in cloud computing.
SAS’s efforts to make it easier to work with big data and apply analytics are another smart bet; our research finds that most organizations today don’t have enough skilled resources in this area. One way to address this gap is to design software that is more intuitive, more visual and more interactive but sophisticated in how it works with the primitives of Hadoop; SAS is addressing this challenge. Our research finds growth of in-memory (now used by 42%) and Hadoop (50%) technologies, which will have more impact as they are applied directly to business needs and opportunities. SAS is at the intersection of data management and analytics for big data technologies, which could position it well for further growth in revenue. SAS is betting that big data will become a focal point in many deployments and that it can help unify data and analytics across the enterprise. Our research agenda for 2014 finds this to be the big opportunity and SAS is fixated on being the vendor of choice for it. If you have not examined how SAS can connect big data architectures and facilitate use of this important technology, it will be worth your time to do so.
CEO & Chief Research Officer