ML 1
•“Machine learning enables a machine to automatically learn from data, improve performance from
experience, and predict things without being explicitly programmed.”
•With the help of sample historical data, known as training data, machine learning algorithms build a
mathematical model that helps in making predictions or decisions without being explicitly programmed.
Machine learning brings computer science and statistics together to create predictive models. It
constructs or uses algorithms that learn from historical data: the more information we provide, the
better the performance.
•A machine has the ability to learn if it can improve its performance by gaining more data.
How does Machine Learning work
• A machine learning system learns from historical data, builds prediction models, and, whenever
it receives new data, predicts the output for it. The accuracy of the predicted output depends on the
amount of data: a larger amount of data helps build a better model, which predicts the output more
accurately.
• Suppose we have a complex problem that requires predictions. Instead of writing code for it, we
just feed the data to generic algorithms, and with the help of these algorithms the machine builds
the logic from the data and predicts the output. Machine learning has changed the way we think
about such problems.
Features of Machine Learning
The need for machine learning is increasing day by day, because it can carry out tasks that are too
complex for a person to implement directly. As humans we have limitations: we cannot process huge
amounts of data manually, so we need computer systems, and this is where machine learning makes
things easy for us.
We can train machine learning algorithms by providing them with large amounts of data and letting
them explore the data, construct models, and predict the required output automatically. The performance
of a machine learning algorithm depends on the amount of data, and it can be measured by a cost
function. With the help of machine learning, we can save both time and money.
Need for Machine Learning
The importance of machine learning is easily understood from its use cases: it is currently used
in self-driving cars, cyber-fraud detection, face recognition, friend suggestions on Facebook, and
more. Top companies such as Netflix and Amazon have built machine learning models that use vast
amounts of data to analyze user interests and recommend products accordingly.
Applications of Machine Learning
Web search: ranking pages based on what you are most likely to click on.
Finance: deciding who to send which credit card offers to, evaluating the risk of credit offers, and
deciding where to invest money.
E-commerce: predicting customer churn and determining whether a transaction is fraudulent.
Space exploration: space probes and radio astronomy.
Prediction: machine learning can be used in prediction systems. In a loan example, to compute the
probability of a default, the system needs to classify the available data into groups.
Image recognition: machine learning can be used for face detection in an image; each person in a
database of several people gets a separate category.
Speech recognition: the translation of spoken words into text, used in voice search and more. Voice
user interfaces include voice dialing, call routing, and appliance control. It can also be used for
simple data entry and the preparation of structured documents.
Medical diagnosis: ML models are trained to recognize cancerous tissue.
Financial industry and trading: companies use ML in fraud investigations and credit checks.
There are many more examples of ML in use.
Machine learning Life cycle
1. Gathering Data:
Data gathering is the first step of the machine learning life cycle. The goal of this step is to identify
and obtain all the data related to the problem.
In this step we identify the different data sources, since data can be collected from various sources such
as files, databases, the internet, or mobile devices. It is one of the most important steps of the life cycle:
the quantity and quality of the collected data determine the efficiency of the output. The more data
we have, the more accurate the prediction will be.
This step includes the below tasks:
• Identify various data sources
• Collect data
• Integrate the data obtained from different sources
By performing the above tasks, we get a coherent set of data, also called a dataset, which will be used in
the further steps.
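A minimal sketch of this step in Python with pandas, assuming two hypothetical sources (a CSV file and a SQLite table) and a shared customer_id key, purely for illustration:

# Hypothetical data-gathering sketch: file names, table names, and columns
# are illustrative assumptions, not part of the original material.
import sqlite3
import pandas as pd

# Collect data from two different sources: a flat file and a database.
customers = pd.read_csv("customers.csv")                   # file source
with sqlite3.connect("shop.db") as conn:
    orders = pd.read_sql("SELECT * FROM orders", conn)     # database source

# Integrate the sources into one coherent dataset keyed on a shared column.
dataset = customers.merge(orders, on="customer_id", how="inner")
print(dataset.shape)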
2. Data preparation
After collecting the data, we need to prepare it for the further steps. Data preparation is the step
where we put our data into a suitable place and prepare it for use in machine learning training.
In this step we first put all the data together and then randomize its ordering.
This step can be further divided into two processes:
Data exploration:
It is used to understand the nature of the data we have to work with: its characteristics, format,
and quality.
A better understanding of the data leads to a more effective outcome. Here we look for correlations,
general trends, and outliers.
Data pre-processing:
The next step is preprocessing the data for analysis.
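As a sketch, the put-the-data-together-and-randomize step and a quick exploration pass might look like this in pandas, continuing the hypothetical dataset from the previous step:

# Randomize the ordering of the rows so later splits are unbiased.
dataset = dataset.sample(frac=1.0, random_state=42).reset_index(drop=True)

# Data exploration: characteristics, format, and quality of the data.
dataset.info()                            # column types and non-null counts
print(dataset.describe())                 # general trends in numeric columns
print(dataset.corr(numeric_only=True))    # pairwise correlations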
3. Data Wrangling
Data wrangling is the process of cleaning raw data and converting it into a usable format. It involves
cleaning the data, selecting the variables to use, and transforming the data into a proper format to make
it more suitable for analysis in the next step. It is one of the most important steps of the complete
process, and cleaning the data is required to address quality issues.
The collected data is not always useful to us, as some of it may be irrelevant. In real-world
applications, collected data may have various issues, including:
Missing Values
Duplicate data
Invalid data
Noise
So we use various filtering techniques to clean the data. It is mandatory to detect and fix the above
issues because they can negatively affect the quality of the outcome.
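A minimal cleaning sketch in pandas, addressing each of the four issues above; the age and income columns and their valid ranges are illustrative assumptions:

import numpy as np

# Duplicate data: drop exact duplicate rows.
dataset = dataset.drop_duplicates()

# Invalid data: treat out-of-range ages as missing (column is an assumption).
dataset.loc[~dataset["age"].between(0, 120), "age"] = np.nan

# Missing values: fill numeric gaps with the column median.
dataset["age"] = dataset["age"].fillna(dataset["age"].median())

# Noise: clip extreme outliers to the 1st..99th percentile range.
low, high = dataset["income"].quantile([0.01, 0.99])
dataset["income"] = dataset["income"].clip(low, high)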
4. Data Analysis
Now the cleaned and prepared data is passed on to the analysis step. This step involves:
Selection of analytical techniques
Building models
Reviewing the results
The aim of this step is to build a machine learning model that analyzes the data using various analytical
techniques, and to review the outcome. It starts with determining the type of problem, where we select a
machine learning technique such as classification, regression, cluster analysis, or association; we then
build the model using the prepared data and evaluate it.
Hence, in this step, we take the data and use machine learning algorithms to build the model.
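A small sketch of technique selection, comparing two candidate classifiers by cross-validation before committing to one; the candidate models and column names are illustrative assumptions:

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X = dataset[["age", "income"]]   # hypothetical features
y = dataset["churned"]           # hypothetical label

for candidate in (LogisticRegression(max_iter=1000), DecisionTreeClassifier()):
    scores = cross_val_score(candidate, X, y, cv=5)   # 5-fold cross-validation
    print(type(candidate).__name__, round(scores.mean(), 3))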
5. Train Model
The model is now trained on some percentage of the cleaned and prepared data.
E.g., apples and bananas are classified according to their features.
6. Test Model
New data is passed through the trained model to predict the output, and the accuracy is calculated.
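As a sketch with scikit-learn, continuing the hypothetical dataset and column names from the earlier steps (all assumptions, not part of the original material):

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X = dataset[["age", "income"]]       # hypothetical features
y = dataset["churned"]               # hypothetical label

# Hold out some percentage of the prepared data for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Train the model on the training portion.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Pass the held-out data through the trained model and compute accuracy.
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))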
7. Deployment
The last step of the machine learning life cycle is deployment, where we deploy the model in a
real-world system.
If the model prepared above produces accurate results that meet our requirements at an acceptable speed,
we deploy it in the real system. But before deploying the project, we check whether it keeps improving
its performance using the available data. The deployment phase is similar to making the final report for
a project.
Learning Paradigms in Machine Learning
A learning paradigm describes a particular pattern by which something or someone learns.
In machine learning, a learning paradigm describes how a machine learns when given some data: its
pattern of approach to that data.
Machine learning is commonly separated into three main learning paradigms:
1. Supervised Learning
2. Unsupervised Learning
3. Reinforcement Learning
1) Supervised Learning
• Supervised learning is a machine learning method in which we provide sample labeled data to the
machine learning system in order to train it, and on that basis it predicts the output.
• The system creates a model using the labeled data to understand the dataset and learn about each
example. Once training and processing are done, we test the model with sample data to check
whether it predicts the correct output.
• The goal of supervised learning is to map input data to output data. Supervised learning is based
on supervision, just as a student learns under the supervision of a teacher. An example of
supervised learning is spam filtering.
• Examples: Regression and Classification
• The main goal of the supervised learning technique is to map the input variable (x) to the output
variable (y). Some real-world applications of supervised learning are risk assessment, fraud detection,
spam filtering, etc.
• Advantages:
• The model can predict the output on the basis of prior experience.
• We have an exact idea of the classes.
• It helps us solve various real-world problems such as fraud detection and spam filtering.
• Disadvantages:
• It is not able to handle very complex tasks.
• It cannot predict the correct output if the training dataset and test dataset differ.
• Most real-world problems do not come with a labeled dataset.
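A minimal supervised classification sketch with scikit-learn's bundled iris dataset, illustrating the labeled-data-in, predictions-out pattern described above:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)            # features and known labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Train under "supervision": the labels y_train guide the learning.
clf = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))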
2) Unsupervised Learning
• Unsupervised learning is a learning method in which a machine learns without any supervision.
• The training is provided to the machine with the set of data that has not been labeled, classified, or
categorized, and the algorithm needs to act on that data without any supervision. The goal of
unsupervised learning is to restructure the input data into new features or a group of objects with similar
patterns.
• In unsupervised learning, we don't have a predetermined result. The machine tries to find useful insights
from the huge amount of data.
• Examples: Clustering and Association Rule Mining
Advantages of Unsupervised Learning
• Unsupervised learning is used for more complex tasks as compared to supervised learning because, in
unsupervised learning, we don't have labeled input data.
• Unsupervised learning is preferable as it is easy to get unlabeled data in comparison to labeled data.
Disadvantages of Unsupervised Learning
• Unsupervised learning is intrinsically more difficult than supervised learning, as it does not have
corresponding output labels.
• The result of an unsupervised learning algorithm may be less accurate because the input data is not
labeled and the algorithm does not know the exact output in advance.
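A minimal unsupervised clustering sketch with scikit-learn; note that no labels are ever shown to the algorithm, it groups points with similar patterns on its own:

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)  # labels ignored
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:10])      # cluster assignment for the first 10 points
print(kmeans.cluster_centers_)  # discovered group centers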
3) Reinforcement Learning
Reinforcement learning is a feedback-based learning method in which a learning agent gets a reward for
each right action and a penalty for each wrong action. The agent learns automatically from this
feedback and improves its performance. In reinforcement learning, the agent interacts with the
environment and explores it. The goal of the agent is to collect the most reward points, and in doing so
it improves its performance.
A robotic dog that automatically learns the movement of its limbs is an example of reinforcement
learning.
Advantages and Disadvantages of Reinforcement Learning
Advantages
• It helps in solving complex real-world problems that are difficult to solve with general techniques.
• The learning model of RL is similar to human learning; hence it can produce highly accurate results.
• It helps in achieving long-term results.
Disadvantages
• RL algorithms are not preferred for simple problems.
• RL algorithms require huge amounts of data and computation.
• Too much reinforcement can lead to an overload of states, which can weaken the results.
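As a toy sketch of the reward/penalty loop, here is tabular Q-learning on a hypothetical five-state corridor where only reaching the rightmost state earns a reward (the environment and constants are illustrative assumptions):

import random

n_states, actions = 5, (-1, +1)          # move left / move right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.1    # learning rate, discount, exploration

for _ in range(500):                     # episodes
    s = 0
    while s != n_states - 1:
        # Epsilon-greedy: explore sometimes, otherwise exploit current Q.
        a = random.choice(actions) if random.random() < epsilon \
            else max(actions, key=lambda a: Q[(s, a)])
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == n_states - 1 else -0.01   # reward / small penalty
        best_next = max(Q[(s2, b)] for b in actions)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# Greedy policy learned from the feedback: should move right everywhere.
print({s: max(actions, key=lambda a: Q[(s, a)]) for s in range(n_states - 1)})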
PAC - Learning
• PAC stands for "probably approximately correct."
• Probably approximately correct (PAC) learning is a framework for the mathematical analysis of
machine learning algorithms.
• In other words, PAC learning is a theoretical framework for analyzing the generalization
performance of machine learning algorithms.
Goal: with high probability ("probably"), the selected hypothesis will have low error
("approximately correct").
In the PAC model, we specify two small parameters, ε (epsilon) and δ (delta), and require that with
probability at least 1 - δ the system learns the concept with error at most ε.
ε and δ parameters:
ε gives an upper bound on the error with which the hypothesis h approximates the target concept
(accuracy: 1 - ε).
δ gives the probability of failing to achieve this accuracy (confidence: 1 - δ).
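Written out, the two parameters combine into a single requirement on the learned hypothesis h relative to the target concept c and the data distribution D:

\Pr\big[\operatorname{error}(h) \le \varepsilon\big] \;\ge\; 1 - \delta,
\qquad \operatorname{error}(h) = \Pr_{x \sim \mathcal{D}}\big[\,h(x) \ne c(x)\,\big]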
A good learner will, with high probability, learn a close approximation to the target concept: with
probability at least 1 - δ, the selected hypothesis has error at most ε ("approximately correct").
PAC – Learning Example
[Worked-example slides: the figures and step-by-step inequalities for learning an axis-aligned
rectangle are not recoverable from the extraction.]
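As a sketch of the standard analysis such a rectangle example usually derives (stated as an assumption about the omitted slides, not a transcription of them): the tightest-fit axis-aligned rectangle errs only on four boundary strips, each carrying probability mass ε/4, so with m independent training examples the union bound gives

\Pr\big[\operatorname{error}(h) > \varepsilon\big]
\;\le\; 4\Big(1 - \frac{\varepsilon}{4}\Big)^{m}
\;\le\; 4e^{-m\varepsilon/4}
\;\le\; \delta
\quad\Longleftrightarrow\quad
m \;\ge\; \frac{4}{\varepsilon}\ln\frac{4}{\delta}

so drawing at least (4/ε) ln(4/δ) examples suffices for the PAC guarantee.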
• Hypothesis: a hypothesis in machine learning is a candidate model that approximates a target
function mapping inputs to outputs.
• Hypothesis Space (H): the set of all possible legal hypotheses. In the figure, all yellow rectangles
parallel to the x and y axes are possible legal hypotheses.
• Version Space: the set of all hypotheses that are consistent with the set of training examples.
Both the most general hypothesis (G) and the most specific hypothesis (S) rectangles are consistent
with the training examples, so whatever rectangle we draw between the most specific and the most
general hypothesis will also be consistent. The space between these two rectangles is called the
'version space'.
Version Space
• The Candidate-Elimination algorithm represents the set of
all hypotheses consistent with the observed training
examples.
• This subset of all hypotheses is called the version space with
respect to the hypothesis space H and the training examples
D, because it contains all possible versions of the target
concept.
• Notation: H = hypothesis space, D = training examples, c = target concept. A hypothesis h is
consistent with D when h(x) = c(x) for every training example x in D.
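In symbols, the version space is

VS_{H,D} = \{\, h \in H \;\mid\; \forall \langle x, c(x) \rangle \in D :\; h(x) = c(x) \,\}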
List-Then-Eliminate Algorithm
• The List-Then-Eliminate algorithm initializes the version space to contain all hypotheses in H,
then eliminates any hypothesis found inconsistent with any training example.
• The version space of candidate hypotheses thus shrinks as more
examples are observed, until ideally just one hypothesis remains that
is consistent with all the observed examples.
– Presumably, this is the desired target concept.
– If insufficient data is available to narrow the version space to a single hypothesis, then
the algorithm can output the entire set of hypotheses consistent with the observed data.
• List-Then-Eliminate algorithm can be applied whenever the
hypothesis space H is finite.
– It has many advantages, including the fact that it is guaranteed to output all
hypotheses consistent with the training data.
– Unfortunately, it requires exhaustively enumerating all hypotheses in H, an unrealistic
requirement for all but the most trivial hypothesis spaces.
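A minimal sketch of the idea in Python, using a deliberately tiny finite hypothesis space of threshold rules; the space and the training examples are illustrative assumptions:

# List-Then-Eliminate over a finite H: each hypothesis is "x >= t" for an
# integer threshold t. Any finite, enumerable H works the same way.
def make_h(t):
    return lambda x: x >= t

H = {t: make_h(t) for t in range(0, 11)}     # version space starts as all of H

D = [(3, False), (7, True), (5, False)]      # (x, c(x)) training examples

# Eliminate every hypothesis inconsistent with any training example.
version_space = {t: h for t, h in H.items()
                 if all(h(x) == label for x, label in D)}
print(sorted(version_space))                 # thresholds consistent with D: [6, 7]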
Candidate-Elimination Algorithm
• The Candidate-Elimination algorithm computes the version space containing all
hypotheses from H that are consistent with an observed sequence of training
examples.
• It begins by initializing the version space to the set of all hypotheses in H; that is, by
initializing the G boundary set to contain the most general hypothesis in H,
G0 ← { ⟨?, ?, ?, ?, ?, ?⟩ }
and initializing the S boundary set to contain the most specific hypothesis,
S0 ← { ⟨0, 0, 0, 0, 0, 0⟩ }
• These two boundary sets delimit the entire hypothesis space, because every
other hypothesis in H is both more general than S0 and more specific than
G0.
• As each training example is considered, the S and G boundary sets are generalized
and specialized, respectively, to eliminate from the version space any hypotheses
found inconsistent with the new training example.
• After all examples have been processed, the computed version space contains all
the hypotheses consistent with these examples and only these hypotheses.
Candidate-Elimination Algorithm
• Initialize G to the set of maximally general hypotheses in H
• Initialize S to the set of maximally specific hypotheses in H
• For each training example d, do
  – If d is a positive example:
    • Remove from G any hypothesis inconsistent with d
    • For each hypothesis s in S that is not consistent with d:
      – Remove s from S
      – Add to S all minimal generalizations h of s such that h is consistent with d
        and some member of G is more general than h
      – Remove from S any hypothesis that is more general than another hypothesis in S
  – If d is a negative example:
    • Remove from S any hypothesis inconsistent with d
    • For each hypothesis g in G that is not consistent with d:
      – Remove g from G
      – Add to G all minimal specializations h of g such that h is consistent with d
        and some member of S is more specific than h
      – Remove from G any hypothesis that is less general than another hypothesis in G
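A compact sketch of this procedure in Python, using conjunctive hypotheses over six discrete attributes in the style of the EnjoySport example ('?' means any value, None means no acceptable value); the attribute domains and training data are illustrative assumptions, and the duplicate-pruning steps are omitted for brevity:

domains = [("Sunny", "Rainy"), ("Warm", "Cold"), ("Normal", "High"),
           ("Strong", "Weak"), ("Warm", "Cool"), ("Same", "Change")]

def consistent(h, x):
    # h covers x when every attribute is '?' or matches exactly.
    return all(hv == "?" or hv == xv for hv, xv in zip(h, x))

def min_generalize(s, x):
    # Smallest change to s that covers the positive example x.
    return tuple(xv if sv is None else sv if sv == xv else "?"
                 for sv, xv in zip(s, x))

def min_specializations(g, x):
    # All one-attribute specializations of g that exclude the negative x.
    out = []
    for i, gv in enumerate(g):
        if gv == "?":
            for v in domains[i]:
                if v != x[i]:
                    out.append(g[:i] + (v,) + g[i + 1:])
    return out

S = [tuple([None] * 6)]          # most specific boundary
G = [tuple(["?"] * 6)]           # most general boundary

data = [  # EnjoySport-style examples: (attributes, label)
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]

for x, positive in data:
    if positive:
        G = [g for g in G if consistent(g, x)]
        S = [min_generalize(s, x) if not consistent(s, x) else s for s in S]
        S = [s for s in S if any(consistent(g, s) for g in G)]
    else:
        S = [s for s in S if not consistent(s, x)]
        G = [h for g in G for h in
             (min_specializations(g, x) if consistent(g, x) else [g])]
        G = [g for g in G if any(consistent(g, s) for s in S)]

print("S:", S)   # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
print("G:", G)   # ('Sunny', '?', ...) and ('?', 'Warm', ...)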
Candidate-Elimination Algorithm – Example
• S0 and G0 are the initial boundary sets, corresponding to the most specific and the most general
hypotheses.
Candidate-Elimination Algorithm – Example: Final Version Space
[The step-by-step trace figures for this example, and for a second example, are not recoverable from
the extraction.]