“Unlearn” to Unleash Your Data Lake

The Data Science Process is about exploring, experimenting, and testing new data sources and analytic tools quickly

It takes years – sometimes a lifetime – to perfect certain skills in life: hitting a jump shot off the dribble, nailing that double high C on the trumpet, parallel parking a Ford Expedition. Malcolm Gladwell wrote a book, “Outliers,” discussing the amount of work – 10,000 hours – required to perfect a skill (while the exactness of 10,000 hours has come under debate, it is still a useful point that people need to invest considerable time and effort to master a skill). But once we get comfortable with something that we feel that we have mastered, we become reluctant to change. We are reluctant to unlearn what we’ve taken so long to master.

Changing your point of release on a jump shot or your embouchure for playing lead trumpet is dang hard! Why? Because it is harder to unlearn than it is to learn. It is harder to un-wire all those synaptic connections and deep memories than it was to wire them in the first place. It’s not just a case of thinking faster, smaller or cheaper; it necessitates thinking differently.

For example, why did it take professional basketball so long to understand the game-changing potential of the 3-point shot? The 3-point shot was added to the NBA during the 1979-1980 season, but for decades the 3-point shot was more a novelty than a serious game strategy. Pat Riley, the legendary coach of the 3-pointer’s first decade in the league (he won NBA Championships in 1982, 1985, 1987 and 1988), called it a “gimmick.” Larry Bird, one of that era’s top players, said: “I really don’t like it.”

It’s only been within the past 3 years that the “economics of the 3-point shot” have changed the fundamentals of how to win an NBA Championship (see Figure 1).

Figure 1: NBA 3-point Baskets per Season

NBA coaches and general managers just didn’t comprehend the “economics of the 3-point shot” and how the 3-point shot could turn a good shooter into a dominant player: a 40% 3-point shooting percentage is equivalent to a 60% 2-point shooting percentage from a points / productivity perspective. The economics of the 3-point shot (coupled with rapid ball movement to create uncontested 3-point shots) weren’t fully exploited until the 2015-2016 season by the Golden State Warriors. Their success over the past 3 seasons (3 trips to the NBA finals with 2 championships) shows how much the game of basketball has been changed.
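The 40%-versus-60% equivalence is simple expected-value arithmetic, and a few lines of Python make it concrete. This is only a minimal sketch: the function names are mine, and it ignores everything else that matters on a real court (free throws, offensive rebounds, shot clock pressure).

```python
import math


def expected_points_per_shot(make_rate: float, points: int) -> float:
    """Average points scored per attempt at a given make rate."""
    return make_rate * points


def break_even_two_point_rate(three_point_rate: float) -> float:
    """2-point make rate needed to match a given 3-point make rate."""
    return three_point_rate * 3 / 2


three_pointer = expected_points_per_shot(0.40, 3)
two_pointer = expected_points_per_shot(0.60, 2)

# Both shots are worth about 1.2 points per attempt.
print(f"40% from three: {three_pointer:.2f} points per shot")
print(f"60% from two:   {two_pointer:.2f} points per shot")
print("Equivalent:", math.isclose(three_pointer, two_pointer))
```

Put another way, a 2-point shooter has to make 1.5 times as many attempts as a 3-point shooter to produce the same points per shot, which is what `break_even_two_point_rate` computes.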

Sometimes it’s necessary to unlearn long-held beliefs (i.e., 2-point shooting in a predominantly isolation-offense game) in order to learn new, more powerful, game-changing beliefs (i.e., 3-point shooting in a rapid ball movement offense).

Sticking with our NBA example, Phil Jackson is considered one of the greatest NBA coaches, with 11 NBA World Championships coaching the Chicago Bulls and the Los Angeles Lakers. Phil Jackson mastered the “Triangle Offense” that played to the strengths of the then dominant players Michael Jordan (Chicago Bulls) and Kobe Bryant (Los Angeles Lakers) to win those 11 titles.

However, the game passed Phil Jackson by as the economics of the 3-point shot changed how to win. Jackson’s tried-and-true “Triangle Offense” failed with the New York Knicks, leading to the team’s dramatic under-performance and ultimately his firing. It serves as a stark reminder of how important it is to be ready to unlearn old skills in order to move forward.

And what holds true for sports holds even more so for technology and business.

The Challenge of Unlearning
For the first two decades of my career, I worked to perfect the art of data warehousing. I was fortunate to be at Metaphor Computers in the 1980s where we refined the art of dimensional modeling and star schemas. I had many years working to perfect my star schema and dimensional modeling skills with data warehouse luminaries like Ralph Kimball, Margy Ross, Warren Thornthwaite, and Bob Becker. It became ingrained in every customer conversation; I’d build a star schema and the conformed dimensions in my head as the client explained their data analysis requirements.

Then Yahoo happened to me and soon everything that I held as absolute truth was turned upside down. I was thrown into a brave new world of analytics based upon petabytes of semi-structured and unstructured data, hundreds of millions of customers with 70 to 80 dimensions and hundreds of metrics, and the need to make campaign decisions in fractions of a second. There was no way that my batch “slice and dice” business intelligence and highly structured data warehouse approach was going to work in this brave new world of real-time, predictive and prescriptive analytics.

I struggled to unlearn ingrained data warehousing concepts in order to embrace this new real-time, predictive and prescriptive world. And this is one of the biggest challenges facing IT leaders today – how to unlearn what they’ve held as gospel and embrace what is new and different. And nowhere do I see that challenge more evident than when I’m discussing Data Science and the Data Lake.

Embracing The “Art of Failure” and The Data Science Process
Nowadays, Chief Information Officers (CIOs) are being asked to lead the digital transformation from a batch world that uses data and analytics to monitor the business to a real-time world that exploits internal and external, structured and unstructured data, to predict what is likely to happen and prescribe recommendations. To power this transition, CIOs must embrace a new approach for deriving customer, product, and operational insights – the Data Science Process (see Figure 2).

Figure 2:  Data Science Engagement Process

The Data Science Process is about exploring, experimenting, and testing new data sources and analytic tools quickly, failing fast but learning faster. The Data Science Process requires business leaders to get comfortable with “good enough” and with failing enough times to become comfortable with the analytic results. Predictions do not live in a perfect world of 100% accuracy. As Yogi Berra famously stated:

“It’s tough to make predictions, especially about the future.”

This highly iterative, fail-fast-but-learn-faster process is the heart of digital transformation – to uncover new customer, product, and operational insights that can optimize key business and operational processes, mitigate regulatory and compliance risks, uncover new revenue streams and create a more compelling, more prescriptive customer engagement. And the platform that is enabling digital transformation is the Data Lake.

The Power of the Data Lake
The data lake exploits the economics of big data: coupling commodity, low-cost servers and storage with open source tools and technologies makes it 50x to 100x cheaper to store, manage and analyze data than using traditional, proprietary data warehousing technologies. However, it’s not just cost that makes the data lake a more compelling platform than the data warehouse. The data lake also provides a new way to power the business, based upon new data and analytics capabilities, agility, speed, and flexibility (see Table 1).

Data Warehouse | Data Lake
Data structured in heavily-engineered structured dimensional schemas | Data structured as-is (structured, semi-structured, and unstructured formats)
Heavily-engineered, pre-processed data ingestion | Rapid as-is data ingestion
Generates retrospective reports from historical, operational data sources | Generates predictions and prescriptions from a wide variety of internal and external data sources
100% accurate results of past events and performance | “Good enough” predictions of future events and performance
Schema-on-load to support the historical reporting on what the business did | Schema-on-query to support the rapid data exploration and hypothesis testing
Extremely difficult to ingest and explore new data sources (measured in weeks or months) | Easy and fast to ingest and explore new data sources (measured in hours or days)
Monolithic design and implementation (waterfall) | Natively parallel scale-out design and implementation (scrum)
Expensive and proprietary | Cheap and open source
Widespread data proliferation (data warehouses and data marts) | Single managed source of organizational data
Rigid; hard to change | Agile; relatively easy to change

Table 1:  Data Warehouse versus Data Lake
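The schema-on-load versus schema-on-query row is easier to see in code than in a table. Here is a minimal sketch in plain Python; the event records, the column list, and both function names are hypothetical stand-ins of mine, not any particular warehouse or data lake product.

```python
import json

# Hypothetical raw events, as they might arrive from different sources.
RAW_EVENTS = [
    '{"customer_id": 1, "amount": 25.0, "channel": "web"}',
    '{"customer_id": 2, "amount": 40.0, "device": {"os": "ios"}}',
    '{"customer_id": 3, "note": "no purchase"}',
]

# The fixed columns a warehouse-style table was designed around.
WAREHOUSE_COLUMNS = ("customer_id", "amount", "channel")


def load_with_schema(raw_lines):
    """Schema-on-load: only records matching the fixed columns get in."""
    table, rejected = [], []
    for line in raw_lines:
        record = json.loads(line)
        if set(record) == set(WAREHOUSE_COLUMNS):
            table.append(tuple(record[c] for c in WAREHOUSE_COLUMNS))
        else:
            rejected.append(line)  # needs schema/ETL rework before it can load
    return table, rejected


def query_with_schema(raw_lines, fields):
    """Schema-on-query: raw records stay as-is; fields are projected at read time."""
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        rows.append({field: record.get(field) for field in fields})
    return rows


table, rejected = load_with_schema(RAW_EVENTS)
print(len(table), "row loaded;", len(rejected), "records rejected")

# The same raw data answers two different questions without any reload.
print(query_with_schema(RAW_EVENTS, ["customer_id", "amount"]))
print(query_with_schema(RAW_EVENTS, ["customer_id", "device"]))
```

In the first path, a new field means reworking the schema and the ingestion job before anyone can look at the data. In the second, the data is already there, and each new hypothesis is just a different projection at read time.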

The data lake supports the unique requirements of the data science team to:

  • Rapidly explore and vet new structured and unstructured data sources
  • Experiment with new analytics algorithms and techniques
  • Quantify cause and effect
  • Measure goodness of fit
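To make the cycle above a little more tangible, here is a minimal sketch of one pass through it: try several candidate data sources against a target, score each by goodness of fit, and keep what looks promising. The data is synthetic and the names (`ad_spend`, `site_visits`, `sales`, `score_candidates`) are made up for illustration. Note that R-squared only measures fit; it does not by itself quantify cause and effect.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predictions):
    """Share of the variance in ys explained by the predictions."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def score_candidates(features, target):
    """Fit one line per candidate feature and rank the features by R^2."""
    scores = {}
    for name, xs in features.items():
        slope, intercept = fit_line(xs, target)
        predictions = [slope * x + intercept for x in xs]
        scores[name] = r_squared(target, predictions)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic data: sales depend on ad_spend plus noise; site_visits is unrelated.
rng = random.Random(42)
ad_spend = [rng.uniform(0, 100) for _ in range(200)]
site_visits = [rng.uniform(0, 100) for _ in range(200)]
sales = [3.0 * spend + rng.gauss(0, 20) for spend in ad_spend]

ranking = score_candidates({"ad_spend": ad_spend, "site_visits": site_visits}, sales)
for name, score in ranking:
    print(f"{name}: R^2 = {score:.3f}")
```

The point is not the model, which is deliberately trivial, but the loop: a candidate that scores poorly is discarded in seconds, and that is the "fail fast but learn faster" behavior the data science team needs the platform to support.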

The data science team needs to be able to perform this cycle in hours or days, not weeks or months. The data warehouse cannot support these data science requirements. The data warehouse cannot rapidly explore the internal and external structured and unstructured data sources. The data warehouse cannot leverage the growing field of deep learning/machine learning/artificial intelligence tools to quantify cause-and-effect. Thinking that the data lake is “cold storage for our data warehouse” – as one data warehouse expert told me – misses the bigger opportunity. That’s yesterday’s “triangle offense” thinking. The world has changed, and just like how the game of basketball is being changed by the “economics of the 3-point shot,” business models are being changed by the “economics of big data.”

But a data lake is more than just a technology stack. To truly exploit the economic potential of the organization’s data, the data lake must come with data management services covering data accuracy, quality, security, completeness and governance. See “Data Lake Plumbers: Operationalizing the Data Lake” for more details (see Figure 3).

Figure 3:  Components of a Data Lake

If the data lake is only going to be used as another data repository, then go ahead and toss your data into your unmanageable gaggle of data warehouses and data marts.

BUT if you are looking to exploit the unique characteristics of data and analytics – assets that never deplete, never wear out and can be used across an infinite number of use cases at zero marginal cost – then the data lake is your “collaborative value creation” platform. The data lake becomes the platform that supports the capture, refinement, protection and re-use of your data and analytic assets across the organization.

But one must be ready to unlearn what they held as the gospel truth with respect to data and analytics; to be ready to throw away what they have mastered to embrace new concepts, technologies, and approaches. It’s challenging, but the economics of big data are too compelling to ignore. In the end, the transition will be enlightening and rewarding. I know, because I have made that journey.

The post “Unlearn” to Unleash Your Data Lake appeared first on InFocus Blog | Dell EMC Services.



More Stories By William Schmarzo

Bill Schmarzo, author of “Big Data: Understanding How Data Powers Big Business” and “Big Data MBA: Driving Business Strategies with Data Science”, is responsible for setting strategy and defining the Big Data service offerings for Hitachi Vantara as CTO, IoT and Analytics.

Previously, as a CTO within Dell EMC’s 2,000+ person consulting organization, he worked with organizations to identify where and how to start their big data journeys. He’s written white papers, is an avid blogger and is a frequent speaker on the use of Big Data and data science to power an organization’s key business initiatives. He is a University of San Francisco School of Management (SOM) Executive Fellow where he teaches the “Big Data MBA” course. Bill also just completed a research paper on “Determining The Economic Value of Data”. Onalytica recently ranked Bill as #4 Big Data Influencer worldwide.

Bill has over three decades of experience in data warehousing, BI and analytics. Bill authored the Vision Workshop methodology that links an organization’s strategic business initiatives with their supporting data and analytic requirements. Bill serves on the City of San Jose’s Technology Innovation Board, and on the faculties of The Data Warehouse Institute and Strata.

Previously, Bill was vice president of Analytics at Yahoo where he was responsible for the development of Yahoo’s Advertiser and Website analytics products, including the delivery of “actionable insights” through a holistic user experience. Before that, Bill oversaw the Analytic Applications business unit at Business Objects, including the development, marketing and sales of their industry-defining analytic applications.

Bill holds a Master of Business Administration from the University of Iowa and a Bachelor of Science degree in Mathematics, Computer Science and Business Administration from Coe College.
