Module 4

Uploaded by

sushma-icb


Machine learning (ML)

● Machine learning is now widely adopted, with businesses across the world using it in their operations. It has advanced to the
point where a model can predict outcomes without being explicitly programmed to do so.

● This field of study uses data and algorithms to mimic human learning, allowing machines to improve over time, becoming
increasingly accurate when making predictions or classifications or uncovering data-driven insights.
● Machine learning, as the name suggests, gives machines the ability to learn autonomously
from experience and observation by analysing patterns within a given data set, without being
explicitly programmed.

● When we write a program for a specific purpose, we are writing a definite set of
instructions which the machine will follow.

● In machine learning, by contrast, we supply a data set and the machine learns by identifying and
analysing the patterns in it. The machine then makes decisions autonomously based on
what it has observed and learnt from the data set.
● Machine learning plays an important role in enterprises because it lets them minimise manual
effort. The model is initially trained with the help of humans, but it eventually takes over
the learnt task.

● Even so, a minimum level of human intervention is still needed to make sure that no “machine-related” glitch arises and to
update the input data.

● Leading companies such as Google, Amazon, Facebook, Tesla, and many more now use these
technologies. Machine learning is therefore becoming a core part of how such companies operate.
Components of machine learning

Every machine learning algorithm has three components:

● Representation
● Evaluation
● Optimization
Representation
● Representation in machine learning refers to how a model is structured so that a computer can understand and
use it.

● Different types of models, like decision trees, support vector machines (SVMs), and neural networks, each have their own way of organizing
data and making predictions.

● Imagine you're building a house. The blueprint you choose (decision tree, SVM, neural network) determines the layout and design
possibilities (classifiers) you can build. Each type of blueprint (representation) has its strengths and limitations. The range of all possible
designs you can create based on that blueprint is called the hypothesis space. It's like the total set of ideas or designs your model can come up
with.

● So, in essence, the representation you choose for your machine learning model sets the boundaries for what kinds of classifiers (models) it can
learn. Different representations offer different strengths and ways of understanding data, influencing how well your model can solve the
problem at hand.
Evaluation
● When we talk about evaluating how well a machine learning model is performing, we use
evaluation functions or metrics. These are essentially tools that measure different aspects of
the model's predictions compared to the actual outcomes.

● There needs to be a function that measures the performance to know which classifiers are
good and which are bad. This is where the evaluation function comes into play.

● Some examples are accuracy, error rate, precision, recall, F-score, squared error, and
information gain. These functions are also referred to as the objective function or scoring
function.
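A few of the evaluation functions named above can be written in a handful of lines. The sketch below is illustrative only: the function names and the small label lists are assumptions made for this example, not taken from the slides.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def error_rate(y_true, y_pred):
    """Fraction of predictions that are wrong (1 - accuracy)."""
    return 1 - accuracy(y_true, y_pred)

def mean_squared_error(y_true, y_pred):
    """Average squared difference, used for numeric predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up classification labels: 3 of the 5 predictions are correct.
labels = [1, 0, 1, 1, 0]
predicted = [1, 0, 0, 1, 1]
acc = accuracy(labels, predicted)
err = error_rate(labels, predicted)

# Made-up regression targets and estimates.
mse = mean_squared_error([2.0, 4.0, 6.0], [2.5, 4.0, 5.0])
print(acc, err, mse)
```

Accuracy and error rate suit classification, while squared error suits numeric predictions; which one to use depends on the task.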
Optimization
● Machine learning models can often be configured or "tuned" in many different ways. These
configurations are like different settings the model can use to try to solve a problem.

● The number of possible configurations is usually far too large to try one by one. So, instead
of trying every single possibility, we use optimization methods. These methods are like smart
strategies that help the model find the best settings more efficiently.

● Choosing the right optimization method is crucial because it determines how quickly and
effectively the model learns from data. Think of it like finding the best route to a destination:
you could wander randomly or use a map and GPS to find the fastest way.
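The three components can be seen working together in a deliberately small sketch. Here the representation is a one-threshold classifier, the evaluation function is accuracy, and the optimization is a plain search over candidate thresholds. The data, the function names, and the choice of a threshold model are all assumptions made for illustration.

```python
# Toy data (assumed for illustration): one numeric feature and a 0/1 label.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0, 0, 0, 1, 1, 1]

def predict(threshold, x):
    """Representation: classify as 1 when x exceeds the threshold."""
    return 1 if x > threshold else 0

def accuracy(threshold, xs, ys):
    """Evaluation: fraction of examples classified correctly."""
    correct = sum(predict(threshold, x) == y for x, y in zip(xs, ys))
    return correct / len(ys)

def fit(xs, ys):
    """Optimization: pick the candidate threshold with the best accuracy."""
    # Hypothesis space: one candidate threshold midway between neighbours.
    pts = sorted(xs)
    candidates = [(a + b) / 2 for a, b in zip(pts, pts[1:])]
    return max(candidates, key=lambda t: accuracy(t, xs, ys))

best = fit(xs, ys)
print(best, accuracy(best, xs, ys))
```

Swapping the threshold model for a decision tree or a neural network changes the representation (and so the hypothesis space), while the evaluation and optimization steps keep the same roles.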
In machine learning projects, we generally divide the original dataset into training data
and test data. We train our model on a subset of the original dataset, i.e., the training
dataset, and then evaluate whether it generalizes well to new or unseen data, i.e., the
test set.
The training data is the largest subset of the original dataset and is used to train or fit the machine
learning model. The training data is fed to the ML algorithm first, which lets it learn how to make
predictions for the given task.

Once we have trained the model with the training dataset, it's time to test the model with the test dataset. This dataset
is used to evaluate the performance of the model and to check that the model generalizes well to new or unseen
data. The test dataset is another subset of the original data, which is independent of the training dataset.
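The split described above can be sketched with the standard library alone. The function name, the 25% test fraction, and the fixed seed are assumptions chosen for this example; in practice a library routine would normally be used.

```python
import random

def train_test_split(data, test_fraction=0.25, seed=0):
    """Shuffle a copy of the data and split it into training and test subsets."""
    rng = random.Random(seed)       # fixed seed so the split is reproducible
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # The test rows are taken first; everything else becomes training data.
    return shuffled[n_test:], shuffled[:n_test]

records = list(range(20))           # stand-in for 20 dataset rows
train, test = train_test_split(records)
print(len(train), len(test))        # 15 5
```

Shuffling before splitting avoids a test set that only contains the last rows of the file, and the two subsets never share a row.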
Generalization
● Real-world data is inherently complex, encompassing variations, noise, and unpredictable
factors. In the realm of machine learning and data science, the ultimate objective is to develop
models capable of delivering accurate predictions and valuable insights when confronted with
new and unseen data.

● Generalization in machine learning refers to the ability of a trained model to accurately make
predictions on new, unseen data. Generalization is important because the true test of a model's
effectiveness is not how well it performs on the training data, but rather how well it generalizes
to new and unseen data.
A spam email classifier is a good example of generalization in machine learning. Suppose you have a training
dataset containing emails labeled as either spam or not spam, and your goal is to build a model that can accurately
classify incoming emails as spam or legitimate based on their content. The model generalizes well if it also
classifies correctly the incoming emails that were not part of the training dataset.
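The difference between fitting the training data and generalizing can be shown with two toy spam models. Everything in this sketch (the emails, the labels, and both models) is made up for illustration: one model memorises whole training emails, the other learns which words appear only in spam.

```python
train = [
    ("win a free prize now", "spam"),
    ("free money waiting for you", "spam"),
    ("meeting moved to monday", "legit"),
    ("lunch at noon tomorrow", "legit"),
]
unseen = [
    ("claim your free gift", "spam"),
    ("monday lunch with the team", "legit"),
]

# Model A memorises whole training emails and answers "legit" for anything new.
memory = dict(train)
def memoriser(text):
    return memory.get(text, "legit")

# Model B learns which words occur only in spam training emails.
spam_words, legit_words = set(), set()
for text, label in train:
    (spam_words if label == "spam" else legit_words).update(text.split())
spam_only = spam_words - legit_words

def word_model(text):
    return "spam" if spam_only & set(text.split()) else "legit"

def accuracy(model, data):
    return sum(model(text) == label for text, label in data) / len(data)

for name, model in [("memoriser", memoriser), ("word model", word_model)]:
    print(name, accuracy(model, train), accuracy(model, unseen))
```

Both models are perfect on the training emails, but only the word-based model keeps its accuracy on the unseen ones, which is what generalization measures.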
Feature Engineering
If we train machine learning models using irrelevant data, even the best machine learning
algorithms won’t help much. Conversely, using well-engineered, meaningful features can
achieve superior performance even with a simple machine learning algorithm.

Feature engineering is especially important when working with traditional
machine learning algorithms, such as regressions, decision trees, support vector machines,
and others that require numeric inputs.

We can divide feature engineering into two components: 1) creating new features
and 2) processing these features to make them work optimally with the machine
learning algorithm under consideration.

Feature engineering is the process of extracting and organizing the important
features from raw data in such a way that they fit the purpose of the machine
learning model. It can be thought of as the art of selecting the important features
and transforming them into refined and meaningful features that suit the needs of
the model.
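The two components can be sketched on a tiny table. The rows, column names, and the particular transformations (a body-mass-index feature, min-max scaling, one-hot encoding) are assumptions chosen for illustration, not a prescribed recipe.

```python
# Made-up raw data: two numeric columns and one categorical column.
rows = [
    {"height_m": 1.60, "weight_kg": 64.0, "city": "Pune"},
    {"height_m": 1.80, "weight_kg": 81.0, "city": "Delhi"},
    {"height_m": 2.00, "weight_kg": 100.0, "city": "Pune"},
]

# 1) Create a new feature from existing ones (body mass index).
for row in rows:
    row["bmi"] = row["weight_kg"] / row["height_m"] ** 2

# 2a) Process a numeric feature: min-max scale it to the range [0, 1].
def min_max_scale(values):
    lo, hi = min(values), max(values)
    # Assumes the values are not all equal, otherwise hi - lo would be zero.
    return [(v - lo) / (hi - lo) for v in values]

scaled_weight = min_max_scale([row["weight_kg"] for row in rows])

# 2b) Process a categorical feature: one-hot encode it into numeric columns.
cities = sorted({row["city"] for row in rows})
one_hot = [[1 if row["city"] == c else 0 for c in cities] for row in rows]

print(cities, one_hot, scaled_weight)
```

The one-hot step matters for the algorithms mentioned above that require numeric inputs: a text column such as the city cannot be fed to them directly.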
Validation Methods
● Following the principle of mistrust, no model is considered acceptable until it
has been tested against data it has not seen before. This process is called
validation.

● Example: cross-validation, such as k-fold validation.

● The caret package in R makes it easy to incorporate cross-validation into your
ML process. The ML process is commonly referred to as a pipeline.
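K-fold validation
● Example dataset (12 observations): 53 62 47 50 36 21 25 28 60 32 10 9

● With k = 4, each fold holds 12 (total observations) / 4 (k) = 3 observations.

● Each fold is used once as the test set while the remaining k - 1 folds are used for training,
giving one score per fold: o1, o2, o3, o4.

● Overall performance = (o1 + o2 + o3 + o4) / k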
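The k-fold example can be traced in code using the 12 values from the slide. The slides do not say which model or score is used, so this sketch assumes a stand-in "model" that always predicts the mean of the training folds, and mean absolute error as the per-fold score. It also assumes the folds are taken in the order the data is listed, without shuffling.

```python
data = [53, 62, 47, 50, 36, 21, 25, 28, 60, 32, 10, 9]  # values from the slide
k = 4
fold_size = len(data) // k          # 12 / 4 = 3 observations per fold

scores = []
for i in range(k):
    start, stop = i * fold_size, (i + 1) * fold_size
    test_fold = data[start:stop]                 # held-out fold
    train_folds = data[:start] + data[stop:]     # remaining k - 1 folds

    # Assumed stand-in "model": always predict the mean of the training folds.
    prediction = sum(train_folds) / len(train_folds)

    # Assumed score: mean absolute error on the held-out fold.
    error = sum(abs(x - prediction) for x in test_fold) / len(test_fold)
    scores.append(error)

cv_score = sum(scores) / k          # (o1 + o2 + o3 + o4) / k
print(scores, cv_score)
```

Every observation is used for testing exactly once and for training k - 1 times, which is why the averaged score is a steadier estimate than a single train/test split.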
Performance metrics
1) Confusion Matrix
Precision, recall, and specificity
True Positive: If a person actually has a disease and the model accurately predicts that they have the disease, then it is
called a true positive. (0)

True Negative: If a person does not have the disease and the model predicts “no,” then this is a true negative. (8)
False Positive: If a person does not have the disease (no) but the model predicts “yes,” then this is a false positive. (0)
False Negative: If a person has the disease (yes) but the model predicts “no,” this is a false negative. (2)
Precision (Positive Predictive Value, PPV) = True Positives / (True Positives + False Positives)
Recall (True Positive Rate, TPR) = True Positives / (True Positives + False Negatives)
Specificity (True Negative Rate, TNR) = True Negatives / (True Negatives + False Positives)
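The three formulas translate directly into code. In this sketch the function names and the first set of counts are hypothetical. The second call assumes that the bracketed numbers after the four definitions above are counts from the filled confusion matrix, which the slides do not state explicitly.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None

def classification_metrics(tp, tn, fp, fn):
    """Precision, recall, and specificity from confusion-matrix counts."""
    return {
        "precision": safe_ratio(tp, tp + fp),
        "recall": safe_ratio(tp, tp + fn),
        "specificity": safe_ratio(tn, tn + fp),
    }

# Hypothetical counts chosen for illustration.
m = classification_metrics(tp=6, tn=8, fp=2, fn=4)
print(m)

# Counts as read from the bracketed numbers above (an assumption).
# With no positive predictions at all, precision is 0/0 and so undefined.
slide = classification_metrics(tp=0, tn=8, fp=0, fn=2)
print(slide)
```

Returning None for an undefined ratio is a design choice of this sketch; some libraries report 0 with a warning instead.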
[Figures: Confusion Matrix Format; Filled Confusion Matrix; worked Precision and Recall]