The Enterprise Data Hub: A place to store all your data with enterprise grade data management, integrations

By Bob Gourley

Hadoop World/Strata has been full of activity and announcements, and we will be providing more coverage through the week. One of the more important announcements was Cloudera’s articulation of an Enterprise Data Hub. The significance of this for enterprise data is huge. Imagine a place where you can store all data, structured and unstructured, at a very economical cost. This alone is a fantastic, highly desired capability. But the enterprise data hub construct offers far more, with the capabilities and features you would expect from a well-engineered solution. These include enhancements that simplify storing, processing, analyzing and managing data, along with enhanced security and auditing and very tight integration with your existing legacy infrastructure and applications. This is going to be big.

The press release from Cloudera is below:

Cloudera Enterprise 5 Sets New Standard for Data Management; Lays Foundation for The Enterprise Data Hub

Oct 29, 2013

Company Extends Category Leadership With Public Beta Release of CDH 5 and Cloudera Enterprise 5; Unveils Industry’s First Enterprise Data Hub and Analysis Platform

PALO ALTO, CA and NEW YORK, NY–(Marketwired – Oct 29, 2013) - From Strata + Hadoop World: Cloudera, the leader in enterprise analytic data management powered by Apache Hadoop™, today unveiled the fifth generation of its Platform for Big Data, Cloudera Enterprise, which is now available for public beta. The new product release, powered by Apache Hadoop 2, offers unique features and advancements that simplify storing, processing, analyzing and managing large structured and unstructured datasets, while offering increased security, robust data management and tight integration with third-party applications. The combination of innovative updates to CDH (Cloudera’s Distribution Including Apache Hadoop) at the core — plus enhancements to Cloudera Manager for Hadoop system administration and Cloudera Navigator for Hadoop audit and access control, data discovery and lineage analysis — together deliver the industry’s first Enterprise Data Hub.

“With Cloudera Enterprise 5, Cloudera has taken several important steps toward realizing its vision to transform Hadoop into an enterprise data hub for analytics,” said Tony Baer, Principal Analyst for Ovum. “Adding support for in-memory data tiering and user-defined functions are essential for delivering the kind of performance that enterprises expect from their analytic data platforms.”

Rethink Data: Introducing the Enterprise Data Hub
Organizations currently employ a variety of systems to support their diverse data hub goals: data warehouses for operational reporting; storage systems to keep data available and safe; specialized massively-parallel databases for large-scale analytics; and search systems for finding and exploring documents. While these systems are suitable for traditional data and workloads, they are not equipped to handle today’s exponential growth in data volume and variety, or the range of users who seek insights from that data. And because each system is purpose-built for a particular class of data and workload, no single system can provide unified access to all relevant information to diverse business users. A new hybrid approach is required, which pragmatically extends the value of existing investments while enabling fundamentally new ways of delivering value from data.

The objective is simple: Acquire and combine any amount or type of data in its original fidelity, in one place, for as long as is necessary, and deliver insights to all kinds of users, as fast as possible. And do so with maximum efficiency of capital and resources.

The solution? The Enterprise Data Hub. One place to store and work with all data, with the flexibility to run a variety of enterprise workloads — including batch processing, interactive SQL, enterprise search and advanced analytics — together with the integrations to existing systems, robust security, governance, data protection, and management that enterprises require. The Enterprise Data Hub is the emerging and necessary center of enterprise data management, complementing existing infrastructure.

Cloudera Enterprise 5: Next Generation Platform for Big Data powered by Apache Hadoop
Built for the demanding requirements of enterprise customers, Cloudera Enterprise enables companies to store, process and analyze unlimited amounts of data and applications from a single system. The newest innovations in Cloudera Enterprise 5 offer customers a significant leap forward in the evolution of the platform, which can now be used to efficiently address an even wider range of business problems. Customers can now use Cloudera to easily handle the rapidly increasing data volume and variety they face, absorbing a growing share of data and workloads from legacy infrastructure while optimizing the efficiency of those existing systems.

Cloudera Enterprise 5 offers a single platform from which organizations can tackle diverse critical business problems:

  • Automatically archiving the complete set of enterprise data to meet compliance requirements while retaining queryable access;
  • Complementing data warehouses to offload data and workloads to help customers increase efficiency and manage costs, while delivering faster ETL/ELT data processing at scale;
  • Supporting business intelligence, through familiar tools, on more data and more kinds of data than ever before possible;
  • Enabling and consolidating enterprise search on data and documents in-place within the single environment; and
  • Accelerating a diverse array of advanced analytics solutions, like recommendation engines, fraud detection or image processing.

Increasingly, strategic partners like Informatica are certifying reference architectures to bring these benefits to joint customers. For example, Informatica and Cloudera together provide a “Data Warehouse Optimization” solution to address the challenges facing traditional data warehouse infrastructures, where capacity is too quickly consumed by increasing data volumes, leading to performance bottlenecks and costly upgrades.

Key advances in Cloudera Enterprise 5 include:

Accelerated Time-to-Value

  • In-Memory HDFS Caching: Datasets from HDFS can now be cached in-memory, boosting MapReduce data processing performance and Cloudera Impala’s analytic query response times for even faster time to insight.
  • User-Defined Functions (UDFs): Customers can now use the custom query functions they depend on in conjunction with Cloudera Impala to deliver the business insights they require. They can also take advantage of the popular open source MADlib library of pre-built statistical and analytic functions to enable scalable in-database analytics.

Improved Efficiency

  • Resource Management: Cloudera Enterprise now delivers advanced resource management for running multiple frameworks for data processing and analysis on a single cluster through the powerful combination of Hadoop YARN (Yet Another Resource Negotiator) and Cloudera Manager. For the first time, administrators can allocate resources not only by workload, but by workgroup, ensuring the best combination of performance and utilization. For example, customers can dedicate 50% of capacity for IT to run mission critical data processing jobs, 30% to the marketing team for ad-hoc BI queries, and so on.
  • Unified Management of Third Party Applications: Cloudera Manager now provides extensibility to enable customers to deploy, manage and monitor products from Cloudera partners such as SAS, Revolution Analytics, Syncsort and many more. Now, customers can manage complex clustered environments from within a single, intuitive interface.
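
As a rough illustration of the workgroup split described under Resource Management, the sketch below divides a cluster's capacity by percentage share. It is not Cloudera Manager's or YARN's configuration format. The function name `allocate`, the 200-unit cluster size and the explicit `unallocated` bucket are assumptions made for the example. Only the 50% and 30% figures come from the text.

```python
def allocate(total_units, shares):
    """Split total_units across workgroups by percentage share (hypothetical helper)."""
    if sum(shares.values()) > 100:
        raise ValueError("shares exceed 100% of capacity")
    # Integer division keeps each grant a whole number of capacity units.
    return {group: total_units * pct // 100 for group, pct in shares.items()}


# Shares taken from the example in the text: 50% for IT, 30% for marketing.
shares = {"it": 50, "marketing": 30}

# Assumed cluster size of 200 capacity units (e.g. virtual cores), for illustration only.
total = 200
plan = allocate(total, shares)

# The text leaves the remaining share unspecified, so it is kept as an explicit remainder.
plan["unallocated"] = total - sum(plan.values())
print(plan)
```

Keeping the remainder explicit makes it easy to see how much capacity is still available for other workgroups before any further shares are assigned.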

Comprehensive Data Management

  • Manage and Explore Big Data. In addition to enabling centralized data auditing for Hadoop, Cloudera Navigator now provides:
    • Data Discovery: Analysts and data modelers can search, explore, define and tag datasets through the Cloudera Navigator interface, to help identify relevant information for downstream analysis or processing.
    • Data Lineage: As the amount of data in Cloudera Enterprise grows, so does the importance of understanding how that data is used across the organization. Cloudera Navigator delivers the industry’s first data lineage solution for Hadoop, enabling customers to meet regulatory requirements, find associated datasets, and satisfy data governance and retention policies.
  • Data Protection: HDFS and HBase now support snapshots to help prevent data loss.
  • NFS-based Data and Application Access: Easily integrate Cloudera Enterprise with data stored in, and applications running on, existing filesystems through native support for NFSv3.

“Over the last five years, we have worked closely with enterprises around the world to help them capture the value in the data they have. Resoundingly, they have asked for a more secure, more reliable real-time data platform that streamlines their existing architectures and speeds up time to insight,” said Mike Olson, chairman and chief strategy officer, Cloudera. “The market has spoken and we are listening. The new capabilities introduced in Cloudera Enterprise 5 deliver the industry’s first Enterprise Data Hub.”

Product Availability and Documentation
Public beta releases of Cloudera Enterprise 5 and CDH 5 are now available. To learn more about Cloudera Enterprise 5, visit http://cloudera.com/CE5. To learn more about CDH 5, or to download it for free, visit http://www.cloudera.com/content/cloudera/en/products/cdh.html.

The Cloudera Enterprise Data Hub is available today on Cloudera Enterprise 4. For more information, contact Cloudera at [email protected].

This information is not a commitment, promise or legal obligation to deliver any material, code, or functionality. Cloudera does not guarantee that the beta software will be made generally available or that any individual feature in the beta version will be made generally available. Cloudera may make the beta software generally available, or not, in its sole discretion and without obligation to make any communication of any kind with regard to such availability.

About Cloudera
Cloudera is revolutionizing enterprise data management by offering the first unified Platform for Big Data: The Enterprise Data Hub. Cloudera offers enterprises one place to store, process and analyze all their data, empowering them to extend the value of existing investments, while enabling fundamental new ways to derive value from their data. Founded in 2008, Cloudera was the first, and is still today, the leading provider and supporter of Hadoop for the enterprise. Cloudera also offers software for business critical data challenges, including storage, access, management, analysis, security and search. With over 15,000 individuals trained, Cloudera is a leading educator of data professionals, offering the industry’s broadest array of Hadoop training and certification programs. Cloudera works with over 700 hardware, software and services partners to meet customers’ big data goals. Leading organizations in every industry run Cloudera in production, including finance, telecommunications, retail, internet, utilities, oil and gas, healthcare, biopharmaceuticals, networking and media, plus top public sector organizations globally. www.cloudera.com

Connect with Cloudera
Read our blog: http://blog.cloudera.com/blog/
Follow us on Twitter: https://twitter.com/cloudera
Visit us on Facebook: https://www.facebook.com/cloudera

Cloudera, Cloudera Manager, Cloudera Navigator, CDH, Cloudera Enterprise, Cloudera Standard and Cloudera Enterprise Data Hub are trademarks or registered trademarks of Cloudera in the United States and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.

Bob Gourley writes on enterprise IT. He is a founder and partner at Cognitio Corp and publisher of CTOvision.com.
