Cisco Systems has announced its intent to acquire Composite Software, which provides data virtualization to help IT departments interconnect data and systems; the purchase is scheduled to complete in early August. Cisco of course is known for its ability to interconnect just about anything with its networking technology; this acquisition will help it connect data better across networks. Over the last decade Composite had been refining the science of virtualizing data but had reached the peak of what it could do by itself, struggling to grow enough to meet the expectations of its investors, board of directors, employees, the market and visionary CEO Jim Green, who is well-known for his long commitment to improving data and technology architectures. According to press reports on the Internet, Cisco paid $180 million for Composite, which if true would be a good reward for people who have worked at Composite for some time and who were substantial shareholders.
Data virtualization is among the topics we explored in our benchmark research on information management. Our findings show that the largest barrier to managing information well is the dispersion of data across too many applications and systems, which impacts two-thirds (67%) of organizations. Our research and discussions with companies suggest that data virtualization can address this problem; it is one aspect of information optimization, an increasingly important IT and business strategy, as I have pointed out. Composite has made data virtualization work in technology initiatives including business analytics and big data, and more broadly in information architecture overall.
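For readers new to the idea, here is a minimal sketch of what data virtualization means in practice. It is written in Python with the standard sqlite3 module and is in no way Composite's implementation. The two in-memory databases, their table names and the customer_revenue function are all hypothetical stand-ins for separate enterprise systems. The point it illustrates is that data stays dispersed in its source systems and is combined only when a query asks for it.

```python
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create an independent in-memory database standing in for one system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn


# Two hypothetical systems that each hold part of the customer picture.
crm = make_source(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)",
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
billing = make_source(
    "CREATE TABLE invoices (customer_id INTEGER, amount REAL)",
    "INSERT INTO invoices VALUES (?, ?)",
    [(1, 120.0), (1, 80.0), (2, 50.0)],
)


def customer_revenue():
    """Virtual view: combine both sources at query time, copying nothing
    into a shared store."""
    # Ask the billing system for totals per customer.
    totals = dict(
        billing.execute(
            "SELECT customer_id, SUM(amount) FROM invoices GROUP BY customer_id"
        )
    )
    # Ask the CRM system for names and attach the totals to each row.
    return [
        {"id": cid, "name": name, "revenue": totals.get(cid, 0.0)}
        for cid, name in crm.execute("SELECT id, name FROM customers ORDER BY id")
    ]


if __name__ == "__main__":
    for row in customer_revenue():
        print(row)
```

A commercial product adds query optimization, caching, security and many more connectors on top of this basic pattern; the sketch shows only the core idea of a view that spans systems.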
Since my last formal analysis of Composite, its technology has advanced significantly. One area of focus had been to increase virtualization of big data technologies, realized recently in its 6.2 SP3 release, which furthers its support of Hadoop. Over the last two years Composite has supported distributions from Apache, Cloudera and Hortonworks, interfacing to MapReduce and Hive since its 6.1 release. In the 6.2 release in 2012, Composite added improved techniques for processing analytics and virtualizing the resulting data sets, which came not only from traditional data sources but also from HP Vertica, PostgreSQL and improved interfaces with IBM and Teradata systems. These releases and expanded data connections have made Composite’s the most versatile data virtualization technology in the industry.
As Composite Software becomes part of Cisco, it is worth remembering that acquiring technology companies has been a major part of building the company Cisco is today; the many acquisitions have expanded its product portfolio and brought together some of the brightest minds in networking technology. Less often noted is Cisco’s mixed success in acquisitions of data-related software for enterprise use. Among the examples here is CxO Systems, which had a complex event processing (CEP) and data processing technology that Cisco eventually used to advance its event processing across network technologies; the software no longer exists in stand-alone form. Another example is Latigent, a provider of call center reporting on detailed call data that was supposed to help Cisco cultivate more services; eventually this software was shown not to fit with Cisco’s technology products or services portfolio and disappeared. Cisco’s business model has been to market and sell to data center and networking environments, and it has not been successful in selling stand-alone software outside this area. It is not easy to compete with IBM, Oracle, SAP and others in technology with which Cisco lacks experience, and this will affect its success in monetizing its investment in Composite Software.
Emerging computing architectures will change the way IT departments operate in the future, but it’s an open question who will lead that change from traditional IT software stacks to hardware and networking technology. Cisco plans to transform the data center and networking from software and hardware stacks, partly through virtualizing software inside premises and out into the Internet and cloud computing. We see IT architectures changing in a haphazard, vendor-driven way while most CIOs want to revise their architectures in an efficient manner. Data and information architectures have moved from relying on databases and applications to adopting innovative technology such as in-memory processing and grid architectures that support new trends in big data, business analytics, cloud computing, mobile technology and social collaboration. These technology innovations and the pressure to use them in business are straining the backbones of enterprise networks, which have to adapt and grow rapidly. Most companies today still have data centers and networking groups and are increasing their network bandwidth and buying new networking technology from Cisco and others. This will not change soon, but they need better intelligence on the networking of data in these technologies. This I think is what Cisco is looking for with the acquisition of Composite, hoping to refine the infrastructure to help in virtualization of data across the network. This approach can play well into what Cisco articulates as a strategic direction around the concept it calls the Internet of Everything (IoE). The challenge will be to convince CIOs that so energetic a change by Cisco is necessary for improving connectivity of data across the network. Data virtualization will play a key role in efficiently interconnecting networks and systems, but it remains to be seen whether it will be applied primarily within the data center and networking rather than where it is centered today, in the stack of information management technology.
I think the acquisition makes sense for Composite Software as it could not grow fast enough to satisfy its technological ambitions and needed more deployments to support its innovations. Cisco for its part will be tested over time to determine what it makes of Composite. I expect that Cisco will fundamentally change Composite’s roadmap and support for enterprise IT as it adjusts the technology to work with its data center and networking technology; such a change inevitably will impact Composite’s existing customers and partners. The company will place Composite in its Services organization (which I was told includes software); I take this as an indication that Cisco needs more maturity to bring software acquisitions into its product and services portfolios and manage them as software with the right level of dedicated marketing and sales resources. For those who want to learn more about data virtualization, I recommend reading the educational content of Composite in the next couple of months before its website and material disappear from the Internet. This may sound harsh, but it is how companies like Cisco digest acquisitions: they eliminate the company’s past and in doing so eliminate critical information. No one is saying so, and I could not get an answer, but the reality is that Composite will become one item in the long product and services list on Cisco’s website; my guess is it will become part of the data center and virtualization services or unified computing services. To see how many Cisco has, just check its website.
Data virtualization is just beginning to mature and attract CIOs and IT professionals who care about the ‘I’ in their titles that stands for information. Composite Software deserves credit for its focus on data virtualization. Our research shows that data virtualization is an advancing priority in information management but ranks lower than others among initiated projects (11%), as the chart shows. Composite’s technology arrived before IT organizations had fully matured, which limited its potential. Within Cisco its people will have to adapt to the new owner’s wishes, which will diffuse some of its database-centered virtualization to focus more on the network, unless Cisco decides to expand its and Composite’s combined R&D team.
Congratulations to Composite Software for its innovative contributions; I am sad to see its independence and passion for data virtualization disappear. I will be watching to see if Cisco understands the company’s real value, not just for data centers and networking but for bridging to enterprise IT and the market for information optimization.
CEO & Chief Research Officer
Teradata recently gave me a technology update and a peek into the future of its portfolio for big data, information management and business analytics at its annual technology influencer summit. The company continues to innovate and build upon its Teradata 14 releases and its new processing technology. Since my last analysis of Teradata’s big data strategy, it has embraced technologies like Hadoop with its Teradata Aster Appliance, which won our 2012 Technology Innovation Award in Big Data. Teradata is steadily extending beyond providing just big data technology to offer a range of analytic options and appliances through advances in Teradata Aster and its overall data and analytic architectures. One example is its data warehouse appliance business, which according to our benchmark research is one of the key technological approaches to big data; Teradata has also advanced its support, with its own technology offering, for in-memory databases, specialized databases and Hadoop in one integrated architecture. It is taking an enterprise management approach to these technologies through Teradata Viewpoint, which helps monitor and manage systems and support a more distributed computing architecture.
By expanding its platform to include workload-based appliances that can support terabytes to petabytes of data, its Unified Data Architecture (UDA) can meet a broad class of enterprise needs. That can help support a range of big data analytic needs, as my colleague Tony Cosentino has pointed out, by providing a common approach to getting data from Hadoop into Teradata Aster and then into Teradata’s analytics. This UDA can begin to address challenges in data activities and tasks in the analytic process, which our research finds are issues for 42 percent of organizations. Teradata Aster Big Analytics Appliance is for organizations that are serious about retaining and analyzing more data, which 29 percent of organizations in our research cited as the top benefit of big data technology. This appliance can handle up to 5 petabytes and is tightly integrated with Aster and Hadoop technology from Hortonworks, a company that is rapidly expanding its footprint, as I have already assessed.
The packaged approach of an appliance can help organizations address what our technology innovation research identified as the largest challenges in big data: not enough skilled resources (for 56% of organizations) and being hard to build and maintain (52%). These can be overcome if an organization designs a big data strategy that can apply a common set of skills, and the Teradata technology portfolio can help with that.
At the influencer summit, I was surprised that Teradata did not go into the role of data integration processes and the steps to profile, cleanse, master, synchronize and even migrate data (which its closest partner, Informatica, emphasizes) but focused more on access to and movement of data through its own connectors, Unity Data Mover, Smart Loader for Hadoop and support of SQL-H. In most of its deployments, complementary data integration technology from its partners plays as much of a role as Teradata’s own. For SQL-H Teradata takes advantage of the metadata in HCatalog to improve access to data in HDFS. I like how Teradata Studio 14 helps simplify the view and use of data in Hadoop, Teradata Aster and even spreadsheets and flat files for building sandbox and test environments for big data. (To learn more, look into the Teradata Developer Exchange.) Teradata has made it easy to add connectors for access to Hadoop through its Exchange, which is a great way to get the latest advances in its utilities and add-ons to its offerings.
Teradata provided an early peek at the just-announced Teradata Intelligent Memory, a significant step in adapting big data architectures to the next generation of memory management. This new advancement can cache and pool data that is in high demand (hot) across any number of Teradata workload-specific platforms by processing data to determine its importance (described as hot, warm or cold) for fast and efficient access and for applying analytics. This technological feat can then utilize both solid-state and conventional disk storage to ensure the fastest access and computation of the data for a range of needs. This is a unique and powerful way to support an extended memory space for big data and to intelligently adapt to the data patterns of user organizations; its algorithms can interoperate across Teradata’s family of appliances.
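To make the hot, warm and cold idea concrete, here is a minimal sketch in Python of ranking data by how often it is accessed. It is an illustration only and not Teradata's algorithm, which was not described in detail. The function name classify_temperature, the slot parameters and the mapping of hot to memory, warm to solid-state and cold to conventional disk are my assumptions for the example.

```python
from collections import Counter


def classify_temperature(access_log, hot_slots, warm_slots):
    """Label each data block hot, warm or cold by access frequency.

    access_log: iterable of block identifiers, one entry per access.
    hot_slots:  how many blocks fit in the fastest tier (assumed: memory).
    warm_slots: how many blocks fit in the middle tier (assumed: solid-state).
    Everything else is treated as cold (assumed: conventional disk).
    """
    counts = Counter(access_log)
    # most_common() orders blocks from most to least frequently accessed.
    ranked = [block for block, _ in counts.most_common()]
    labels = {}
    for position, block in enumerate(ranked):
        if position < hot_slots:
            labels[block] = "hot"
        elif position < hot_slots + warm_slots:
            labels[block] = "warm"
        else:
            labels[block] = "cold"
    return labels


if __name__ == "__main__":
    # Hypothetical access log: block "a" is read far more often than the rest.
    log = ["a"] * 10 + ["b"] * 5 + ["c"] * 3 + ["d"] + ["e"]
    print(classify_temperature(log, hot_slots=1, warm_slots=2))
```

A production system would also age out old accesses so that data cools over time; this sketch counts every access equally to keep the ranking step visible.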
Teradata has also invested further in its data and computing architecture through what it calls fabric-based computing. That can help connect nodes across systems through access on the company’s Fabric Switch using its BYNET, InfiniBand and other methods. (Teradata participates in the OpenFabrics Alliance, which works to optimize access and interconnection of systems data across storage-area networks.) Fabric Switch provides an access point through which other aspects of Teradata’s UDA can access and use data for various purposes, including backup and restore or data movement. These advances will significantly increase the throughput and combined reliability of systems and enhance performance and scalability at both the user and data levels.
Tony Cosentino pointed out the various types of analytics that Teradata can support; one of them is analytics for discovery through its recently launched Teradata Aster Discovery Platform. This directly addresses two of the four types of discovery I have just outlined: data and visual discovery. Teradata Aster has a powerful library of analytics covering path, text, statistical, cluster and other areas as core elements of its platform. Its nPath analytic expression has significant potential in enabling Aster to process distributed sets of data from Teradata and Hadoop in one platform. Analytic architectures should apply the same computational analytics across systems, from core database technology to Teradata Aster to the analytics tools that an analyst is actually using. Aster’s approach to visual and data discovery is challenging in that it requires a high level of expertise in SQL to make customizations; the majority of analysts who could use this technology don’t have that level of knowledge. But here Teradata can turn to partners such as MicroStrategy and Tableau, which have built more integrated support for Teradata Aster and offer easier-to-use tools that are interactive and visual, designed for analysts who do not want to muck with SQL. Teradata has internal challenges in improving support for analysts and the analytic processes they are responsible for; its IT-focused, data-centric approach will not help here. Our big data research finds that staffing and training are the top two barriers for using this technology, according to more than 77 percent of organizations; vendors should note this and reduce the custom and manual work that requires specific SQL and data skills in their products.
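For a sense of what the path analytics mentioned above involve, here is a rough sketch in plain Python. It is not nPath syntax and not how Aster executes it; it only illustrates the general technique of ordering each user's events in time and matching a pattern against the sequence. The event data, the SYMBOLS mapping and the function name users_matching_path are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical one-letter codes for the actions we care about.
SYMBOLS = {"search": "S", "view": "V", "purchase": "P"}


def users_matching_path(events, pattern):
    """Return users whose time-ordered actions contain the given path.

    events:  iterable of (user, timestamp, action) tuples.
    pattern: regular expression over the one-letter codes in SYMBOLS,
             for example "SV+P" for a search, one or more views, then a purchase.
    """
    sequences = defaultdict(list)
    # Order by user and then by time so each sequence reads chronologically.
    for user, _timestamp, action in sorted(events, key=lambda e: (e[0], e[1])):
        sequences[user].append(SYMBOLS.get(action, "?"))
    compiled = re.compile(pattern)
    return sorted(
        user for user, codes in sequences.items() if compiled.search("".join(codes))
    )


if __name__ == "__main__":
    clickstream = [
        ("u1", 1, "search"), ("u1", 2, "view"), ("u1", 3, "view"), ("u1", 4, "purchase"),
        ("u2", 1, "view"), ("u2", 2, "purchase"),
        ("u3", 1, "search"), ("u3", 2, "purchase"),
    ]
    print(users_matching_path(clickstream, "SV+P"))
```

Even this small example shows why analysts without programming or SQL skills need friendlier tools: the pattern is easy to state in words and much harder to express by hand.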
Regarding analytics specifically, Teradata has continued to deepen its analytics efforts with partner SAS. A new release of Teradata Appliance supports SAS High-Performance Analytics for up to 52 terabytes of data and also supports SAS Visual Analytics, which I have tried and assessed myself.
Through its Teradata Aprimo applications Teradata continues its efforts to attract marketing executives in business-to-consumer companies that require big data technology to utilize a broad range of information. Teradata has outlined a larger role for the CMO with big data and analytics capabilities that go well beyond its marketing automation software. The company announced expansion to support predictive analytics and has outlined its direction for supporting customer engagement. It needs to take steps such as these to ensure it tunes into business needs beyond what CIOs and IT are doing with Teradata as a big data environment for the enterprise.
Along these lines I have also pointed out that we should be cautious about accepting research that predicts the CMO will outspend the CIO in the future. What I have seen in these assertions is flawed in many respects, and they often come from those who have no experience in market research, the role of marketing or dealing with technology expenditure in that context. As we have done research into both the business and IT sides, we have discovered the complexities of making practical technology investments; for example, our research into customer relationship maturity found that inbound interactions from customers occur across many departments; they occur in marketing (in 46% of organizations), but more often through contact centers (77%), where Teradata should strengthen its efforts. On the plus side Teradata continues to demonstrate success in assisting customers in marketing, winning our 2013 Leadership Award for Marketing Excellence with its deployment at International Speedway Corp. and in 2012 at Nationwide Insurance with Teradata Aprimo. Our current research into next-generation customer engagement already identifies a need to support multichannel and multidepartment interactions. Teradata could further expand its efforts in these areas with existing customers; KPN won our 2013 Leadership Award in Customer Excellence after connecting Teradata with its Oracle-based applications and supporting BI systems.
Overall Teradata is doing a great job of focusing on its strengths in big data and areas where it can maximize the impact of its analytics, especially marketing and customer relations. While IBM, Oracle, SAP and other large technology providers in the database and analytic markets tend to minimize what Teradata has created, it has a loyal customer base that is attracted to the expanded architectures of its appliances and its broader UDA and intelligent memory systems. I think with more focus on the processes of real business analysts and further simplifying usability, Teradata’s opportunity could grow significantly. In helping its customers process more of the vast volumes of data and information from the Internet, such as weather, demographic and social media data, it could make clear the broader value of big data in optimizing information from the variety of data in content and documents. It could expand its new generation of tools and applications to exploit the use of this information, as it is beginning to do with marketing applications from Teradata Aprimo. If Teradata customers find it easier to access information and share it across lines of business through social collaboration and mobile technology, that will further demand for its technology to operate on larger scales in both the number of users and the places where it can be accessed, even via cloud computing. Exploiting in-memory computing along with providing more discovery potential from analytics will help its customers utilize the power of big data and trust in Teradata to supply it.
CEO & Chief Research Officer