Lecture9 ML Introduction.pptx

The document provides an overview of Machine Learning (ML), a subset of Artificial Intelligence (AI), detailing its definitions, types (supervised, unsupervised, reinforcement), and applications. It emphasizes the importance of data transformation and preparation in ML projects, outlining techniques such as data cleaning, normalization, and feature extraction. Additionally, it discusses the lifecycle of ML, including data splitting and dataset selection criteria.

Uploaded by Alaaeee

Artificial Intelligence (AI):

Machine Learning
Machine learning
🡪 Machine learning is a subset of AI and focuses on the
ability of machines to receive a set of data and learn for
themselves, changing algorithms as they learn more about
the information they are processing.
Machine Learning definition
🡪 Arthur Samuel (1959). Machine Learning: Field
of study that gives computers the ability to
learn without being explicitly programmed.
Machine Learning definition
🡪 Tom Mitchell (1998) Well-posed Learning
Problem: A computer program is said to learn
from experience E with respect to some
task T and some performance measure P, if
its performance on T, as measured by P,
improves with experience E.
“A computer program is said to learn from experience E with respect to
some task T and some performance measure P, if its performance on T,
as measured by P, improves with experience E.”

Suppose your email program watches which emails you do or do not mark as
spam, and based on that learns how to better filter spam. What are the
task T, the experience E, and the performance measure P in this setting?
🡪 Classifying emails as spam or not spam. 🡪 (T)
🡪 Watching you label emails as spam or not spam. 🡪 (E)
🡪 The number (or fraction) of emails correctly classified as spam/not spam. 🡪 (P)
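As a rough illustration of the performance measure P, the sketch below computes the fraction of emails classified correctly. The function name `accuracy` and the sample labels are invented for this example and are not part of the lecture.

```python
def accuracy(predicted, actual):
    """Fraction of emails whose predicted label matches the user's label."""
    if len(predicted) != len(actual):
        raise ValueError("label lists must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Hypothetical labels: True means "spam".
user_labels = [True, False, False, True, False]    # experience E
filter_output = [True, False, True, True, False]   # result of task T
p = accuracy(filter_output, user_labels)           # performance measure P
```

If the filter learns from more labeled emails and `p` rises, the program is learning in the sense of the definition above.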
Machine Learning Applications
🡪 Examples:
🡪 Database mining, email filtering, web search engines
🡪 Handwriting recognition, most of Natural Language
Processing (NLP), computer vision
🡪 Self-customizing programs
🡪 E.g., Amazon, Netflix product recommendations
Machine learning algorithms:

🡪 Supervised learning
🡪 Unsupervised learning
🡪 Reinforcement learning
Supervised learning
🡪 A supervised learning algorithm learns from labeled
training data (that is, data already tagged with
the correct answer), which helps it predict outcomes for
unseen data.
🡪 Machines are repeatedly fed data such as the characteristics,
patterns, dimensions, color and height of objects, people
or situations until they can produce accurate
predictions or classifications.
Supervised Learning
🡪 1- Regression
🡪 Predict continuous-valued output.
● Example: weather prediction, Predicting house prices
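A minimal sketch of regression, assuming a single feature and a straight-line model fitted by ordinary least squares. The helper `fit_line` and the house data are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up training data: house size (square metres) -> price (thousands).
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]               # exactly 3 * size, for easy checking
slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 90 + intercept    # continuous-valued output
```

The prediction for an unseen 90 square metre house is a number on a continuous scale, which is what separates regression from classification.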
Supervised Learning
🡪 2- Classification
🡪 Estimate discrete-valued output
🡪 Examples of problems you would address using classification
🡪 Given email labeled as spam/not spam, learn a spam filter.
🡪 Given a dataset of patients diagnosed as either having diabetes or not,
learn to classify new patients as having diabetes or not.
Supervised learning: classification
Example 2: Cancer Diagnosis (malignant, benign)
🡪 One feature or variable (tumor size)
[Figure: Malignant? (1 = yes, 0 = no) plotted against tumor size]
🡪 Other features
🡪 Clump thickness
🡪 Uniformity of cell size
🡪 Uniformity of cell shape
🡪 …
Supervised learning
🡪 Cancer Diagnosis
🡪 It is a classification problem
🡪 Discrete valued output (0 or 1) two classes
🡪 The output can have more than two options or classes
◻ 0 🡪 benign
◻ 1 🡪 Type 1 Cancer
◻ 2 🡪 Type 2 Cancer
◻ 3 🡪 Type 3 Cancer
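To make the idea of a discrete-valued output concrete, here is a small sketch of a k-nearest-neighbours classifier on one feature. This is only one of many possible classification methods and is not named in the lecture. The tumor sizes and labels are invented.

```python
from collections import Counter

def knn_predict(train, x, k=3):
    """Predict the label of x by majority vote among its k nearest neighbours."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up labeled data: (tumor size, label); 0 = benign, 1 = malignant.
training_data = [(1.0, 0), (1.5, 0), (2.0, 0), (4.0, 1), (4.5, 1), (5.0, 1)]
small = knn_predict(training_data, 1.2)   # discrete output: a class label
large = knn_predict(training_data, 4.8)
```

The same function works unchanged with more than two classes, because it only counts votes per label.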
Example
🡪 You’re running a company, and you want to develop learning algorithms
to address each of the following problems.
🡪 Should you treat these as classification or as regression problems?
🡪 Problem 1: You have a large inventory of identical items. You want to
predict how many of these items will sell over the next 3 months.
🡪 (Regression)
🡪 Problem 2: You’d like software to examine individual customer
accounts, and for each account decide if it has been
hacked/compromised.
🡪 (Classification)
Unsupervised learning
🡪 Unsupervised learning models the underlying or
hidden structure or distribution in the data in order to
learn more about the data.
🡪 In unsupervised learning you have only input data
and no corresponding output variables (unlabelled
data).
🡪 You need to allow the model to work on its own to
discover information.
Unsupervised learning
🡪 Examples:
🡪 Given a database of customer data, automatically discover
market segments and group customers into different market
segments.
🡪 Given a set of news articles found on the web, group them
into sets of articles about the same story.
🡪 Other applications: organizing computing clusters, social network
analysis, market segmentation.
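The market segmentation example can be sketched with k-means clustering, one common unsupervised method (the lecture does not name a specific algorithm). The function `kmeans_1d`, its deterministic initialisation, and the spending figures are assumptions for illustration.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Lloyd's algorithm on 1-D data (k >= 2); returns (centroids, assignments)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(values, assignments) if a == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, assignments

# Made-up monthly spend per customer; no labels are provided.
monthly_spend = [12, 15, 14, 90, 95, 88]
centroids, segments = kmeans_1d(monthly_spend, k=2)
```

No customer was tagged as "low spender" or "high spender" in advance. The two groups emerge from the data alone.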
Supervised vs. Unsupervised machine learning
Reinforcement Learning
● Reinforcement learning (RL) involves training an agent to make a sequence of
decisions by interacting with an environment.
● The agent learns to achieve a goal by maximizing
cumulative rewards through trial and error.
● Applications:
− Training a robot to navigate a maze.
− Developing a game-playing AI (e.g., AlphaGo).
− Optimizing strategies in dynamic pricing.
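A minimal sketch of trial-and-error learning, assuming tabular Q-learning (one standard RL algorithm, not named in the lecture) on a hypothetical five-cell corridor where the agent is rewarded only on reaching the last cell. All names and parameter values are illustrative choices.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # index 0 moves left, index 1 moves right

def step(state, action):
    """Environment: move, stay inside the corridor, reward 1 at the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore at random with probability epsilon or when values tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning update toward reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q

q_table = train()
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:-1]]
```

After training, the greedy policy read from the table heads toward the goal from every non-terminal cell, even though the agent was never told which direction is correct.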
Hybrid Approaches
● Many real-world applications involve a combination of
supervised and unsupervised learning:
● Semi-Supervised Learning: Uses a small amount of
labeled data with a large amount of unlabeled data to
improve learning accuracy.
● Reinforcement Learning: Can use supervised
learning to train a policy, while also exploring the
environment without labeled examples.
ML Lifecycle
Data
🡪 Data is the heart of every machine learning algorithm.
🡪 Data comes in all shapes and sizes: from images to text
to time series data.
🡪 A simple Excel spreadsheet might have data in a few
columns, while a more complex BigQuery dataset could
have millions of rows and thousands of columns.
🡪 No matter the format, though, all data has to be
transformed before it can be used in a machine learning
(ML) project.
Data transformation
🡪 Data transformation is also known as data preparation
or data preprocessing.
🡪 It makes sure that your data is clean and ready to be
used by your machine learning algorithm. Without data
transformation, your model is unlikely to make accurate
predictions.
Data transformation
🡪 There are many different types of data transformation,
depending on what kind of data you have and what you
want to do with it. Some common types include:
🡪 Data cleaning (remove irrelevant data)
🡪 Feature extraction
🡪 Feature creation
🡪 Data normalization
🡪 Data aggregation/disaggregation
1- Data cleaning
● Removing incorrect or incomplete information from your dataset
● Adding or fixing missing values
● Dealing with outliers or extreme values
● Putting each data column in a proper format (e.g., dates)
● These errors can happen for a number of reasons, including
human error, software bugs, or simply because data is missing in
the original source.
● Data cleaning is often the most time-consuming step.
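The cleaning steps above can be sketched with the standard library alone. The record layout, the `dd/mm/yyyy` input format, and the cap of 120 on ages are assumptions chosen for the example.

```python
from datetime import datetime
from statistics import median

MAX_AGE = 120  # assumed upper bound used to cap extreme values

def clean(records):
    """Drop rows with a bad date, cap outlier ages, fill missing ages."""
    valid = []
    for row in records:
        try:
            # Put the date column into one proper format (ISO 8601).
            parsed = datetime.strptime(row["date"], "%d/%m/%Y")
        except (ValueError, TypeError):
            continue  # incorrect or incomplete date: remove the row
        valid.append({**row, "date": parsed.date().isoformat()})
    for row in valid:
        if row["age"] is not None:
            row["age"] = min(row["age"], MAX_AGE)   # deal with outliers
    fill = median(r["age"] for r in valid if r["age"] is not None)
    for row in valid:
        if row["age"] is None:
            row["age"] = fill                       # fix missing values
    return valid

raw = [
    {"date": "03/01/2024", "age": 34},
    {"date": "not a date", "age": 40},
    {"date": "15/02/2024", "age": None},
    {"date": "20/02/2024", "age": 430},
    {"date": "21/02/2024", "age": 36},
]
cleaned = clean(raw)
```

Capping before computing the median keeps the extreme value from distorting the number used to fill the gaps.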
Feature extraction
- It is the process of reducing a large amount of
information down to a smaller set of more useful
variables (data reduction).
- It’s used to make working with data easier and to
improve the accuracy of predictions.
- Apply techniques like Principal Component Analysis
(PCA) or t-SNE to reduce the number of features while
retaining important information.
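As a hedged illustration of PCA, the sketch below handles only the special case of two features reduced to one, where the leading direction of the covariance matrix has a closed form. A real project would normally use a library implementation. The function names and data are invented.

```python
import math

def first_principal_axis(points):
    """Unit vector along the direction of greatest variance for 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Closed-form orientation of the leading eigenvector of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta)), (mx, my)

def project(points):
    """Reduce each 2-D point to one coordinate along the first principal axis."""
    (ux, uy), (mx, my) = first_principal_axis(points)
    return [(x - mx) * ux + (y - my) * uy for x, y in points]

# Made-up data lying exactly on the line y = 2x.
data = [(-1.0, -2.0), (0.0, 0.0), (1.0, 2.0)]
reduced = project(data)   # one number per point instead of two
```

Because these points lie on a single line, one coordinate is enough to describe them without losing information.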
Feature Creation (Feature Engineering)
● Feature creation is the process of adding extra
information to your dataset where none existed
previously.
● Feature creation is a common technique in machine learning.
It enriches existing records with new columns, which
distinguishes it from data augmentation, which adds new
training examples. It’s used to make use of data that would
otherwise be ignored, and it can improve the accuracy of predictions.
Feature Creation
● For instance, you might have a dataset of photos, but
there’s no information about when each photo was taken.
● Feature creation can be used to add this information to
the dataset. This might be done by looking at the EXIF
data of each photo (this is the data that’s automatically
added by the camera when a photo is taken), or by
cross-referencing with scraped web data.
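A small sketch of feature creation, assuming the capture time has already been read into a field called `taken_at` (a hypothetical name; reading EXIF data itself is not shown). New columns are derived from that one timestamp.

```python
from datetime import datetime

def add_time_features(photo):
    """Return a copy of the record with features derived from its timestamp."""
    taken = datetime.fromisoformat(photo["taken_at"])
    return {
        **photo,
        "hour": taken.hour,
        "weekday": taken.weekday(),          # Monday = 0 ... Sunday = 6
        "is_weekend": taken.weekday() >= 5,
    }

# Hypothetical photo record.
photo = {"file": "beach.jpg", "taken_at": "2024-06-15T18:30:00"}
enriched = add_time_features(photo)
```

The original record is left untouched, so the raw data stays available if a derived feature later turns out to be unhelpful.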
Data normalization
● Data normalization is the process of making sure all
values in your dataset are on the same scale. It’s a
common data transformation technique, and it’s often
used when working with numerical data.
● For instance, you might have a dataset with values that are
measured in inches and values that are measured in
centimeters.
● Another column might have a metric that ranges from 0
to 100, and another column might have a metric that
ranges from 0 to 1.
Examples of Data normalization Techniques
1. Min-Max normalization: This technique scales the values of a feature to a
range between 0 and 1. This is done by subtracting the minimum value of
the feature from each value, and then dividing by the range of the feature.
2. Z-score normalization: This technique scales the values of a feature to
have a mean of 0 and a standard deviation of 1. This is done by subtracting
the mean of the feature from each value, and then dividing by the standard
deviation.
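Both techniques can be sketched in a few lines. The function names and sample values are illustrative, and the sketch assumes the feature is not constant (otherwise the range or standard deviation is zero and the division fails).

```python
from statistics import mean, pstdev

def min_max(values):
    """Scale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Scale values to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

scores = [0, 25, 50, 100]
scaled = min_max(scores)          # [0.0, 0.25, 0.5, 1.0]
standardized = z_score(scores)
```

This uses the population standard deviation. Some libraries default to the sample standard deviation instead, which gives slightly different numbers on small datasets.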
Data aggregation
● It is the process of combining multiple datasets into one. It’s a common
data transformation technique, and it’s often used when working with data
from different sources.
● For instance, you might have data from two different surveys, each with
different questions. Data aggregation can be used to combine the two
datasets into one. This way, you can analyze the data from both surveys
together.
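The survey example might look like the sketch below, assuming both surveys identify respondents by a shared ID. The IDs, question names, and answers are invented.

```python
def combine_surveys(survey_a, survey_b):
    """Merge two surveys keyed by respondent ID into one table of records."""
    combined = {}
    for survey in (survey_a, survey_b):
        for respondent_id, answers in survey.items():
            combined.setdefault(respondent_id, {}).update(answers)
    return combined

# Two hypothetical surveys with different questions.
survey_a = {"r1": {"age": 31}, "r2": {"age": 45}}
survey_b = {"r1": {"commute_km": 12}, "r3": {"commute_km": 4}}
merged = combine_surveys(survey_a, survey_b)
```

Respondents who answered only one survey keep a partial record, so the combined data may need missing-value handling afterwards.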
Data disaggregation
● Data disaggregation is the opposite of data aggregation. It’s the
process of splitting one large dataset into several smaller ones.
● For instance, you might have data that’s been aggregated by
country. Data disaggregation can be used to split this dataset
into smaller datasets, one for each country. This way, you can
analyze the data for each country separately.
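Splitting by country can be sketched as a simple group-by. The helper `split_by` and the rows are made up for illustration.

```python
from collections import defaultdict

def split_by(rows, key):
    """Split one dataset into several, one per distinct value of `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)

# Hypothetical combined dataset.
sales = [
    {"country": "EG", "amount": 10},
    {"country": "FR", "amount": 7},
    {"country": "EG", "amount": 3},
]
per_country = split_by(sales, "country")
```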

Data Augmentation
● Increase the diversity of the training data without collecting
new data.
● Common in image processing (e.g., rotating, flipping,
cropping images).
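A toy sketch of image augmentation, treating an image as a nested list of pixel values. Real pipelines would use an image library. The function names and the tiny image are assumptions for this example.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def rotate_90_clockwise(image):
    """Rotate the image a quarter turn clockwise."""
    return [list(row) for row in zip(*image[::-1])]

# A made-up 2 x 3 grayscale image.
image = [[1, 2, 3],
         [4, 5, 6]]
augmented = [image, flip_horizontal(image), rotate_90_clockwise(image)]
```

One original image yields three training examples without any new data being collected.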
Data Splitting
● Split data into training, validation, and test sets to evaluate model
performance properly.
● Common splits are 70-20-10 or 80-10-10 for training, validation,
and testing, respectively.
● Why?
- Prevent overfitting: performance is checked on data the model did
not see during training.
- Model evaluation: a separate test set provides an unbiased
estimate of how the model performs on new data.
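A minimal sketch of a 70-20-10 split, assuming the samples can be shuffled freely (which would not be appropriate for time series, for example). The function name, default fractions, and seed are illustrative.

```python
import random

def split_dataset(samples, train=0.7, validation=0.2, seed=42):
    """Shuffle, then cut into training, validation, and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # fixed seed for reproducibility
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],          # the remainder is the test set
    )

train_set, val_set, test_set = split_dataset(range(100))
```

Every sample lands in exactly one of the three sets, so the test set stays unseen during training and tuning.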
Online Datasets

● Kaggle website
● UCI Machine Learning Repository
● Google Dataset Search
● Data.gov
● AWS Public Datasets
Education Data

● National Center for Education Statistics (NCES): The primary federal
entity for collecting and analyzing education-related data in the U.S.
● EdX and Coursera Datasets: Both platforms provide datasets for
educational purposes and research.
Choosing a Dataset
When selecting a dataset, consider the following factors:

● Relevance: Ensure the dataset aligns with your research or project
goals.
● Quality: Check for completeness, accuracy, and the presence of
necessary metadata.
● Size: Consider whether the dataset size is manageable with your
available computational resources.
● Accessibility: Verify that the dataset is accessible and that you have
the necessary permissions to use it.
Python for Machine Learning (Key Libraries and Frameworks)
Deep Learning
Machine Learning
Based on the Machine Learning course on Coursera by Andrew Ng
On YouTube:
https://youtu.be/gb262LDH1So?si=oRXw28ir6vMGQMFX (original course)
https://www.youtube.com/watch?v=vStJoetOxJg&list=PLkDaE6sCZn6FNC6YRfRQc_FbeQrF8BwGI (current course)
