You are currently browsing the category archive for the ‘Location Intelligence’ category.
I had the pleasure of attending Cloudera’s recent analyst summit. Presenters reviewed the work the company has done since its founding six years ago and outlined its plans to use Hadoop to further empower big data technology to support what I call information optimization. Cloudera’s executive team has the co-founders of Hadoop who worked at Facebook, Oracle and Yahoo when they developed and used Hadoop. Last year they brought in CEO Tom Reilly, who led successful organizations at ArcSight, HP and IBM. Cloudera now has more than 500 employees, 800 partners and 40,000 users trained in its commercial version of Hadoop. The Hadoop technology has brought to the market an integration of computing, memory and disk storage; Cloudera has expanded the capabilities of this open source software for its customers through unique extension and commercialization of open source for enterprise use. The importance of big data is undisputed now: For example, our latest research in big data analytics finds it to be very important in 47 percent of organizations. However, we also find that only 14 percent are very satisfied with their use of big data, so there is plenty of room for improvement. How well Cloudera moves forward this year and next will determine its ability to compete in big data over the next five years.
Cloudera’s technology supports what it calls an enterprise data hub (EDH), which ties together a series of integrated components for big data that include batch processing, analytic SQL, a search engine, machine learning, event stream processing and workload management; this is much like the way relational databases and tools evolved in the past. These features also can deal with the types of big data most often used, according to our research: 40 percent or more use five types, from transactional data (60%) to machine data (42%). Hadoop combines layers of the data and analytics stack from collection, staging and storage to data integration and integration with other technologies. For its part Cloudera has a sophisticated focus on both engineering and customer support. Its goal is to enable enterprise big data management that can connect and integrate with other data and applications from its range of partners. Cloudera also seeks to facilitate converged analytics. One of these partners, Zoomdata, demonstrated the potential of big data analytics in analytic discovery and exploration through its visualization on the Cloudera platform; its integrated and interactive tool can be used by business people as well as professionals in analytics, data management and IT.
Cloudera latest major release with Cloudera Enterprise 5 brought a range of enterprise advancements from in-memory processing, resource management, data management, data protection to name a few. Cloudera offers a range of product options that they announced to make it easier to embrace their Hadoop technology. Cloudera Express is its free version of Hadoop, and it provides three editions licensed through subscription: basic, flex and data hub. The Flex Edition of Cloudera Enterprise has support for analytic SQL, search, machine learning, event stream processing and online NoSQL through the Hadoop components HBase, Impala, Spark and Navigator; a customer organization can have one of these per Hadoop cluster. The Enterprise Data Hub (EDH) Edition enables use of any of the components in any configuration. Cloudera Navigator is a product for managing metadata, discovery and lineage, and in 2014 it will add search, annotation and registration on metadata. Cloudera uses Apache Hive to support SQL through HiveQL, and Cloudera Impala provides a unique interface to the Hadoop file system HDFS using SQL. This is in line with what our research shows organizations prefer: More than half (52%) use standard SQL to access Hadoop. This range of choices in getting to data within Hadoop helps Cloudera’s customers realize a broad range of uses that include predictive customer care, market risk management, customer experience and other areas where very large volumes of information can be applied for applications that were not cost-effective before. With EDH Edition Cloudera can compete directly with large players IBM, Oracle, SAS and Teradata, all of which have ambitions to provide the hub of big data operations for enterprises.
Having open source roots, community is especially important to Hadoop. Part of building a community is providing training to certify and validate skills. Cloudera has enrolled more than 50,000 professionals in its Cloudera University and works with online learning provider Udacity to increase the number of certified Hadoop users. It also has developed academic relationships to promote Hadoop skills being taught to computer science students. Our research finds that this sort of activity is necessary: The most common challenge in big data analytics processes for two out of three (67%) organizations is not having enough skilled resources; we have found similar issues in the implementation and management of big data. The other aspect of a community is to enlist partners that offer specific capabilities. I am impressed with Cloudera’s range of partners, from OEMs and system integrators to channel resellers such as Cisco, Dell, HP, NetApp and Oracle to support in the cloud from Amazon, IBM, Verizon and others.
To help it keep up Cloudera announced it has raised another $160 million from the likes of T. Rowe Price, Michael Dell Ventures and Google Ventures to add to financing from venture capital firms. With this funding Cloudera outlined its investment focus for 2014 which will concentrate on advancing database and storage, security, in-memory computing and cloud deployment. I believe that it will need to go further to meet the growing needs for integration and analytics and prove that it can provide a high-value integrated offering directly as well as through partners. Investing in its Navigator product also is important, as our research finds that quality and consistency of data is the most challenging aspect of the big data analytics process in 56 percent of organizations. At the same time, Cloudera should focus on optimizing its infrastructure for the four types of data discovery that are required according to our analysis.
Cloudera’s advantage is being the focal point in the Hadoop ecosystem while others are still trying to match its numbers in developers and partners to serve big data needs. Our research finds substantial growth opportunity here: Hadoop will be used in 30 percent of organizations through 2015 and another 12 percent are planning to evaluate it. Our research also finds a significant lead for Cloudera in Hadoop distributions, but other options like Hortonworks and MapR are growing. The research finds that the most of these organizations are seeking the ability to respond faster to opportunities and threats; to do that they will need to have a next generation of skills to apply to big data projects. Our research in information optimization finds that over half (56%) of organizations are planning to use big data and Hadoop will be a key focus for those efforts. Cloudera has a strong position in the expanding big data market because it focuses on the fundamentals of information management and analytics through Hadoop. But it faces stiff competition from the established providers of RDBMSs and data appliances that are blending Hadoop with their technology as well as from a growing number of providers of commercial versions of Hadoop. Cloudera is well managed and has finances to meet these challenges; now it needs to be able to show many high-value production deployments in 2014 as the center of business’s big data strategies. If you are building a big data strategy with Hadoop, Cloudera must be in the evaluation priority for an organization.
CEO & Chief Research Officer
Many businesses are close to being overwhelmed by the unceasing growth of data they must process and analyze to find insights that can improve their operations and results. To manage this big data they find a rapidly expanding portfolio of technology products. A significant vendor in this market is SAS Institute. I recently attended the company’s annual analyst summit, Inside Intelligence 2014 (Twitter Hashtag #SASSB). SAS reported more than $3 billion in software revenue for 2013 and is known globally for its analytics software. Recently it has become a more significant presence in data management as well. SAS provides applications for various lines of business and industries in areas as diverse as fraud prevention, security, customer service and marketing. To accomplish this it applies analytics to what is now called big data, but the company has many decades of experience in dealing with large volumes of data. Recently SAS set a goal to be the vendor of choice for the analytic, data and visualization software needs for Hadoop. To achieve this aggressive goal the company will have to make significant further investments in not only its products but also marketing and sales. Our benchmark research on big data analytics shows that three out of four (76%) organizations view big data analytics as analyzing data from all sources, not just one, which sets the bar high for vendors seeking to win their business.
In the last few years SAS has been investing heavily to expand its portfolio in big data. Today its in-memory infrastructure can operate within Hadoop, execute MapReduce jobs, access the various commercial distributions of Hadoop, conduct data preparation and modeling in Hadoop and extend it to its data and visual discovery and exploration tools. SAS has architected its analytics tools and platform to use Hadoop’s Pig and Hive interfaces, apply MapReduce to process large data sets and use Hadoop Distributed File System (HDFS) to store and access the big data. To exploit Hadoop more deeply, the SAS LASR Analytic Server (part of SAS Visual Analytics) connects directly to HDFS to speed performance. SAS LASR Analytic Server is an in-memory computing platform for data processing and analysis that can scale up and operate in parallel within Hadoop to distribute the computation and data workloads. This flexibility in the architecture enables users to adapt SAS to any type of big data, especially Hadoop deployments, for just about any scale and configuration. To work with other database-oriented technologies the company has built technical partnerships not only with major players Teradata and SAP but also with the new breed of Hadoop vendors Cloudera, Hortonworks and Pivotal, as well as with IBM BigInsights. SAS also engineered access to SAP HANA, which establishes further integrated into SAP’s data platform for analytics and other applications.
At the Inside Intelligence gathering, SAS demonstrated its new Visual Statistics product. Like its Visual Analytics this one is available online for evaluation. It offers sophisticated support for analysts and data professionals who need more than just a visually interactive analytic tool of the sort that many providers now sell. Developing a product like Visual Statistics is a smart move according to our research, which finds that predictive analytics and statistics is the most important area of big data analytics, cited by 78 percent of organizations. At this point visual and data discovery are most common, but we see that users are looking for more. SAS Visual Statistics can conduct in-memory statistical processing and compute results inside Hadoop before the data is transferred to another analytic data repository or read directly into an analytics tool. A demonstration of these capabilities at the analyst summit revealed how these capabilities along with the use of tools in SAS 9.4 could raise the bar for sophisticated analytics tools for business.
SAS also has a data management software suite for data integration, quality, mastering and governance and is working to make the product known for its big data support. This is another important area: Our research in big data analytics finds quality and consistency of data to be significant challenges for 56 percent of organizations and also that 47 percent are not satisfied with integration of information for creating big data analytics. SAS is expanding to provide data quality tools for Hadoop. Its portfolio is expansive in this area, but it should take steps to market these capabilities better, which spokespeople said it will do in 2014. Our recent research in information optimization found that organizations still are spending disproportionate amounts of time in preparing data (47%) and reviewing it (45%) for analytics. They need to address these difficulties to free their analysts to spend more time on analysis that produces recommendations for decision-makers and to collaborate on business improvement. SAS’s efforts to integrate data and analytics should help reduce the time spent on preparation and help analysts focus on what matters.
SAS also will expand its SAS Stream Processing Engine with a new release coming by midyear. This product can process data as it is being generated, which facilitates real-time analytics – that’s the third-most important type of big data analytics according to our research. Applying analytics in real time is the most important aspect of in-memory computing for two-thirds (65%) of organizations and is critical as SAS expands its SAS LASR Analytic Server. Our benchmark research on operational intelligence shows that the processing of event data is critical for areas like activity or event monitoring (said 62% of participants) and alerting and notification (59%). SAS will need to expand its portfolio in these areas but it is fulfilling on what I call the four types of discovery for big data.
SAS also is moving deeper into cloud computing with support for both private and public clouds through investments in its own data centers. Cloud computing is an increasingly popular approach to building a sandbox environment for big data analytics. Our research finds that more than one-fifth of organizations prefer to use cloud computing in an on-demand approach. SAS will have to provide even more of its portfolio using big data in the cloud or risk customers turning to Amazon and others for processing and potentially other computing uses. SAS asserts it is investing and expanding in cloud computing.
SAS’s efforts to make it easier to work with big data and apply analytics is another smart bet; our research finds that most organizations today don’t have enough skilled resources in this area. One way to address this gap is to design software that is more intuitive, more visual and more interactive but sophisticated in how it works with the primitives of Hadoop; SAS is addressing this challenge. Our research finds growth of in-memory (now used by 42%) and Hadoop (50%) technologies, which will have more impact as they are applied directly to business needs and opportunities. SAS is at the intersection of data management and analytics for big data technologies, which could position it well for further growth in revenue. SAS is betting that big data will become a focal point in many deployments and they can help unify data and analytics across the enterprise. Our research agenda for 2014 finds this to be the big opportunity and SAS is fixated on being the vendor of choice for it. If you have not examined how SAS can connect big data architectures and facilitate use of this important technology, it will be worth your time to do so.
CEO & Chief Research Officer