03 - IBM Watsonx - Data Introduction For Clients
watsonx.data
Scale AI Workloads
For All Your Data, Anywhere
Content by:
Kevin Shen
Product Manager | Data & AI Software
[email protected]
Anson Kokkat
Product Manager | Data & AI Software
[email protected]
Joshua Kim
Program Director | Data & AI Software
[email protected]
Adam Learmonth
Advisory, Learning Content Development
[email protected]
Presenter:
Ahmad Muzaffar Baharudin
Technical Enablement Specialist | Data & AI
[email protected]
Massive early adoption
Broad-reaching & deep impact
Critical focus of AI activity & investment
There’s more data: exploding data growth
The aggregate volume of data stored is set to grow over 250% in the next 5 years.

In more locations: multiple locations, clouds, applications and silos
82% of enterprises are inhibited by data silos.

In more formats: documents, images, video
80% of time is spent on data cleaning, integration and preparation.

With less quality: stale and inconsistent
82% of enterprises say data quality is a barrier on their data integration projects.

Source: https://www.idc.com/getdoc.jsp?containerId=US49018922
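As a rough aid for reading the growth figure above, the sketch below converts a five-year total into the yearly rate it implies. The slide does not say whether "grow over 250%" means growing by 250% (3.5x) or to 250% (2.5x), so both readings are shown. The function name and the assumption of a constant, compounding yearly rate are illustrative and do not come from the cited source.

```python
def implied_annual_growth(total_multiple: float, years: int) -> float:
    """Return the constant yearly rate that compounds to total_multiple.

    Assumes smooth compound growth, which is a simplification.
    """
    if total_multiple <= 0 or years <= 0:
        raise ValueError("total_multiple and years must be positive")
    return total_multiple ** (1.0 / years) - 1.0


# Reading A (assumption): volume grows BY 250%, ending at 3.5x today's size.
grows_by_250 = implied_annual_growth(1 + 2.5, 5)

# Reading B (assumption): volume grows TO 250% of today's size, ending at 2.5x.
grows_to_250 = implied_annual_growth(2.5, 5)

print(f"Grows by 250% over 5 years: {grows_by_250:.1%} per year")
print(f"Grows to 250% over 5 years: {grows_to_250:.1%} per year")
```

Under either reading the implied rate is in the range of roughly 20% to 28% per year, which gives a sense of how quickly storage and integration work would need to scale.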
Traditional approaches to addressing these challenges have created more overall
complexity and cost, which has led to the emergence of data lakehouse architectures
[Figure: timeline of traditional approaches, starting in the early 2000s]
Enterprise leaders require a data architecture that can provide quick access to data, centralized governance and fit-for-purpose use.

1. Ability to scale AI while supporting compliance with lineage and reproducibility of data
2. Real-time analytics and BI that can connect to existing data in minutes without expensive duplicating or moving of data
3. Data sharing and self-service access for more users and more data while strengthening governance and security
Introducing…
watsonx
The platform for AI and data

watsonx.ai: Build, train, validate, tune and deploy AI models
watsonx.data: Scale AI workloads, for all your data, anywhere
watsonx.governance: Accelerate responsible, transparent and explainable AI workflows
IBM watsonx
The platform for AI and data

Scale and accelerate the impact of AI with trusted data.

Enable fine-tuned models to be managed through market-leading governance and lifecycle management capabilities
Leverage foundation models to automate data search, discovery, and linking in watsonx.data

[Figure: watsonx.ai, watsonx.data and watsonx.governance; visible labels: 1 Prompting, 4 Training from scratch]
Put AI to work with watsonx
Scale and accelerate the impact of AI with trusted data
What IBM offers
watsonx.data
watsonx.data
Scale AI workloads, for all your data, anywhere
A fit-for-purpose data store, based on an open lakehouse architecture, supported by querying, governance and open data formats to access and share data

Hybrid Cloud: Access all your data through a single point of entry across all clouds and on-premises environments.
Built-in Governance: Get started in minutes with built-in governance, security and automation.
Open Source: Reduce the cost of a data warehouse by up to 50%* through workload optimization across multiple query engines and storage tiers.

*When comparing published 2023 list prices normalized for VPC hours of IBM watsonx.data to several major cloud data warehouse vendors. Savings may vary depending on configurations, workloads and vendors.
Access all your data across
hybrid cloud through a
single point of entry
An open data store, based on an
open lakehouse architecture built
for hybrid deployment of your data,
analytics, and AI workloads
Get started in minutes
Connect to your existing analytics data and deploy fit-for-purpose query engines in minutes
Reduce your data
warehouse costs by
up to 50%* by
optimizing workloads
Optimize workloads from your data
warehouse when you take advantage
of low-cost object storage and
fit-for-purpose query engines
Access all your data quickly, and optimize your data architecture with multi-engine support and hybrid deployment of analytics and AI workloads

[Figure: IBM watsonx.data positioned by deployment (public cloud), types of workloads (structured to unstructured) and technology (proprietary to open)]
The IBM approach to a data lakehouse architecture combines the best of IBM with the best of open source

Best-in-class cost and performance optimizations for compute and storage
Built-in integrations with IBM data repositories and data fabric
Deep expertise and capabilities in data and storage
Effortlessly populate watsonx.data with trusted data leveraging best-in-class data ingestion and observability

1. Ingest data into the storage layer; use fit-for-purpose watsonx.data query engines for BI/reporting/data science
3. Continuously monitor and improve the health of Spark/Python pipelines

Continuously detect and resolve data quality incidents
Monitor and improve the health of DataStage, Spark, or Python pipeline workloads running on watsonx.data; detect data anomalies and accelerate issue resolution

[Figure: watsonx.data with metadata store and access control management]
Overview of the key components of IBM watsonx.data: multiple query engines, open table formats, and built-in enterprise governance

[Figure: watsonx.data alongside your existing ecosystem (data warehouse, data lake). Layers shown: governance and metadata (metadata store, access control management), data format, storage, infrastructure]
watsonx.data
Use cases
Powered by
Over 2000 daily reports and 100s of pipelines on a 7 PB data lake with over 400 billion records
Up to 100,000 daily queries (over 1.5 million queries per month) with over 2000 active internal users on a 2 PB data lake
30,000 queries per day with 1000 daily active users on a 300 PB data lake
Over 500,000 queries per day with 7000 weekly active users on a 50 PB data lake
Over 2 million queries per day for business intelligence and one-off use cases
Over 2700 active internal users running 1 million queries scanning 40 PB of data per month
watsonx.data is helping companies scale their AI workloads.

“We look forward to partnering with IBM to optimize the watsonx.data stack and contributing to the open-source community.”

“We’re excited to see how watsonx can help us drive predictive analytics, identify fraud, and optimize our marketing.”
Why IBM?
Getting started
Three ways to get started with watsonx.data today
IBM’s investment in partnering with clients
© 2023 International Business Machines Corporation
Thank you
IBM and the IBM logo are trademarks of IBM
Corporation, registered in many jurisdictions
worldwide. Other product and service names might be
trademarks of IBM or other companies. A current list
of IBM trademarks is available on ibm.com/trademark.