Machine Learning Notes (1)
Definition
Machine learning (ML) is a branch of artificial intelligence that enables systems to automatically learn and improve from experience without being explicitly programmed. It involves algorithms that can analyze data, identify patterns, and make decisions or predictions. The goal of ML is to create systems that can adapt and evolve as they are exposed to new data.
History
The history of machine learning dates back to the mid-20th century. In 1959, Arthur
Samuel coined the term "machine learning" while working on a program that could play
checkers. Over the decades, machine learning has evolved from basic algorithms to
complex deep learning models, thanks to advances in computing power, big data, and
mathematical theory. In recent years, it has become a core part of many modern technologies and applications.
Need
Machine learning is needed to cope with the growing volume, variety, and complexity of data. It is especially useful in areas where explicit rules are hard to define and insights must be drawn from very large datasets.
Features of Machine Learning
Key features of machine learning include the ability to learn from past experiences, automation of analytical model building, and continuous improvement over time. It relies on statistical methods and is capable of handling both structured and unstructured data.
Classification of Machine Learning
Machine learning is commonly divided into three broad types:
● Supervised Learning: In this type, the model is trained on labeled data. The
algorithm learns to map input data to known output labels. Examples include
classification and regression tasks.
● Unsupervised Learning: Here, the model is trained on unlabeled data and
attempts to find hidden patterns or structures. Common techniques include
clustering and dimensionality reduction.
● Reinforcement Learning: This is a type of learning where an agent interacts with
an environment and learns to take actions to maximize a reward signal. It is
commonly used in robotics and game-playing AI.
ML Lifecycle
The machine learning lifecycle covers defining the problem, collecting and preparing the data, selecting and training the model, evaluating it, and then deploying the model and monitoring its performance in the real world. This cycle is iterative and involves continuous feedback.
Applications
Machine learning is applied in many modern domains, such as autonomous vehicles, and it has transformed how businesses operate and how decisions are made.
Parametric vs Non-parametric Models
Parametric models assume a fixed functional form with a fixed number of parameters (e.g., linear regression). These models are generally simpler and faster to train. Non-parametric models, like decision trees or k-nearest neighbors, do not assume a fixed form and can grow more complex with data. They are more flexible but often require more data and more computation.
Bias-Variance Tradeoff
Bias refers to errors from overly simplistic models that fail to capture the underlying patterns (underfitting), while
variance refers to models that are too complex and sensitive to training data
(overfitting). A good model balances bias and variance to generalize well on unseen
data.
Underfitting
Underfitting occurs when a model is too simple to learn the underlying structure of the
data. It leads to poor performance on both training and test datasets. This can happen when the model lacks sufficient complexity or when important features are missing from the data.
Overfitting
Overfitting happens when a model learns the training data too well, including its noise
and outliers, and performs poorly on new data. It occurs when the model is too complex
or is trained for too long without proper regularization. Techniques like pruning, cross-validation, and early stopping can help prevent it.
Machine Learning vs. Statistical Modelling
Statistical models are often interpretable, whereas ML models, especially deep learning models, often behave as black boxes that trade interpretability for predictive power.
Loss Functions
Loss functions measure how well a model’s predictions match actual outcomes.
Common loss functions include Mean Squared Error (MSE) for regression and
Cross-Entropy Loss for classification. The choice of loss function affects how the model
learns and is optimized.
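As a rough illustration of these two losses, here is a small NumPy sketch (the arrays are invented):

import numpy as np

# Regression: Mean Squared Error between actual and predicted values
y_true = np.array([3.0, 5.0, 2.5])
y_pred = np.array([2.8, 5.4, 2.0])
mse = np.mean((y_true - y_pred) ** 2)

# Binary classification: cross-entropy between labels and predicted probabilities
labels = np.array([1, 0, 1])
probs = np.array([0.9, 0.2, 0.7])
cross_entropy = -np.mean(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))

print(mse, cross_entropy)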
Model Tuning
Model tuning should stop when improvements are minimal and not worth the added complexity. Over-tuning can
lead to overfitting. Using techniques like early stopping can help automate this process.
Cross-Validation
Cross-validation is a technique used to assess the performance of a model more
reliably by dividing data into multiple subsets (folds). One popular method is k-fold
cross-validation, where the data is split into k parts and the model is trained and tested
k times, each time with a different fold used as the test set. It helps reduce the variability of performance estimates and gives a more reliable picture of how the model generalizes.
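A minimal scikit-learn sketch of 5-fold cross-validation (the iris dataset and logistic regression are just stand-ins for any data and model):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)              # example dataset
model = LogisticRegression(max_iter=1000)      # any estimator works here
scores = cross_val_score(model, X, y, cv=5)    # one score per fold
print(scores.mean(), scores.std())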
Grid Search
Grid Search is a systematic way to find the best combination of hyperparameters for a
machine learning model. It involves specifying a set of possible values for each hyperparameter, training and evaluating the model on every combination (often with cross-validation), and selecting the best-performing one.
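For instance, a hedged sketch with scikit-learn's GridSearchCV (the parameter grid below is illustrative, not prescriptive):

from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
param_grid = {"max_depth": [2, 4, 6], "min_samples_split": [2, 5, 10]}
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)                                 # tries every combination with 5-fold CV
print(search.best_params_, search.best_score_)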
Dimensionality Reduction
Dimensionality reduction is a process used in machine learning to reduce the number of input
variables or features in a dataset. It helps in simplifying models, reducing computation time, and
removing noise or redundancy in data. By reducing dimensions, the data becomes easier to
visualize and interpret while retaining as much relevant information as possible. This technique
is especially useful when dealing with high-dimensional datasets, which can suffer from the
"curse of dimensionality."
Row and Column Vectors
In the context of linear algebra and machine learning, a row vector is a 1 × n matrix (a single
row with multiple columns), and a column vector is an n × 1 matrix (a single column with
multiple rows). Each vector can represent a set of features or data points. For example, in a
dataset, a row vector can represent one observation across multiple features, while a column
vector can represent one feature across all observations.
Data Matrix
A dataset can be represented as a data matrix, where rows represent individual samples (data
points), and columns represent features. If there are m samples and n features, the dataset
becomes an m × n matrix. This matrix form is useful because many machine learning
algorithms, especially those involving linear algebra (like PCA), are designed to operate on
matrices.
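For example (values invented), a dataset with m = 3 samples and n = 2 features is just a 3 × 2 array:

import numpy as np

# rows = samples, columns = features
X = np.array([[170.0, 65.0],
              [160.0, 55.0],
              [180.0, 80.0]])
print(X.shape)    # (3, 2): an m x n data matrix
print(X[0])       # first sample (a row vector)
print(X[:, 1])    # second feature across all samples (a column)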
Data Preprocessing
Data preprocessing is a crucial step in the ML pipeline where raw data is cleaned and
transformed into a usable format. It includes tasks such as handling missing values, encoding
categorical variables, and normalizing numerical values. Good preprocessing ensures that the
data fed into the model is accurate, consistent, and suitable for learning.
Feature Normalization
Feature normalization is the process of scaling the values of features so they fall within a
specific range (typically 0 to 1 or -1 to 1). This is important because different features may have
different scales, and unnormalized data can bias machine learning algorithms. Normalization
ensures that each feature contributes equally to the learning process, especially in algorithms
that use distance metrics.
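A small sketch of min-max scaling to the 0–1 range with NumPy (the feature matrix is made up):

import numpy as np

X = np.array([[25.0, 30000.0],
              [32.0, 45000.0],
              [40.0, 90000.0]])
X_min = X.min(axis=0)
X_max = X.max(axis=0)
X_norm = (X - X_min) / (X_max - X_min)   # every column now lies in [0, 1]
print(X_norm)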
Mean of a Data Matrix
The mean of a data matrix is calculated feature-wise, by averaging each column. This gives a
vector containing the average value of each feature across all samples. Subtracting the mean
from each element (centering the data) is a standard step in many preprocessing techniques,
such as PCA, to ensure that the data has zero mean.
Column Standardization
Column standardization (also called z-score normalization) transforms each feature in the
dataset so that it has a mean of 0 and a standard deviation of 1. This is done by subtracting the
column mean from each element and dividing by the column’s standard deviation.
Standardization is especially important for algorithms like PCA and k-means, which are sensitive
to the scale of features.
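Both steps, the column-wise mean/centering described above and z-score standardization, can be sketched directly in NumPy (same kind of invented matrix):

import numpy as np

X = np.array([[25.0, 30000.0],
              [32.0, 45000.0],
              [40.0, 90000.0]])
mean = X.mean(axis=0)                         # feature-wise (column) means
X_centered = X - mean                         # zero-mean data, as used in PCA
X_standardized = X_centered / X.std(axis=0)   # mean 0, std 1 per column
print(X_standardized.mean(axis=0), X_standardized.std(axis=0))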
Covariance Matrix
The covariance matrix captures how much the features vary with respect to each other. For an
n-dimensional dataset, the covariance matrix is an n × n matrix, where each element (i, j)
indicates the covariance between feature i and feature j. A high positive value means the
features increase together, a negative value means they vary inversely, and zero means no
linear relationship.
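With NumPy, the covariance matrix of a data matrix whose rows are samples can be computed as follows (sketch):

import numpy as np

X = np.array([[170.0, 65.0],
              [160.0, 55.0],
              [180.0, 80.0]])
cov = np.cov(X, rowvar=False)   # rowvar=False: treat columns as features
print(cov)                      # 2 x 2 covariance matrix for 2 features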
Principal Component Analysis (PCA)
PCA is one of the most widely used techniques for dimensionality reduction. It transforms the
original features into a new set of uncorrelated features called principal components, ordered
by the amount of variance they explain. PCA identifies the directions (components) in which the
data varies the most and projects the data onto those directions. By keeping only the top few
principal components, we can reduce the dimensionality of the data while retaining most of its
important information. PCA helps in visualization, noise reduction, and improving model
efficiency.
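A minimal scikit-learn sketch that standardizes the data and keeps the top two principal components (the iris dataset is only an example):

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)   # column standardization first
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)
print(X_reduced.shape)                   # (150, 2)
print(pca.explained_variance_ratio_)     # variance explained by each component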
Supervised Learning
Supervised learning is a type of machine learning where the algorithm is trained on a labeled
dataset. This means that each training example is paired with an output label. The goal of
supervised learning is to learn a mapping from inputs to outputs so that the model can predict
the output for new, unseen data. It is called "supervised" because the learning process is guided
by the correct answers provided during training.
Supervised learning begins with a dataset that includes both input features (independent
variables) and corresponding output labels (dependent variables). The algorithm analyzes the
training data and learns a function that maps the inputs to the correct outputs. This function is
then used to predict outputs for new inputs. The accuracy of the predictions is evaluated using
performance metrics such as accuracy, precision, recall, or mean squared error, depending on
the problem type (classification or regression). The model improves over time through
optimization techniques that minimize the difference between predicted and actual values.
Supervised learning can be broadly categorized into two types based on the type of output:
● Classification: When the output variable is categorical (e.g., spam or not spam).
● Regression: When the output variable is continuous (e.g., predicting house prices).
k-Nearest Neighbors (k-NN)
k-NN is a simple and intuitive classification algorithm. It classifies a new data point based on the
majority label among its k closest neighbors in the training data. The closeness is usually
measured using distance metrics like Euclidean distance. It is non-parametric and lazy, meaning
it doesn’t learn a model during training but rather memorizes the training data and classifies only
at prediction time. While k-NN is easy to implement, it can be slow with large datasets.
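A short sketch with scikit-learn's KNeighborsClassifier and k = 3 (example dataset):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
knn = KNeighborsClassifier(n_neighbors=3)   # k = 3 nearest neighbors
knn.fit(X_train, y_train)                   # "training" just stores the data
print(knn.score(X_test, y_test))            # accuracy on unseen data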
Naïve Bayes
Naïve Bayes is a probabilistic classifier based on Bayes’ Theorem. It assumes that all features
are independent of each other given the class label, which is rarely true in practice but works
surprisingly well. It calculates the probability of each class given the input features and selects
the class with the highest probability. It is especially effective for text classification problems like
spam detection and sentiment analysis due to its simplicity and speed.
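A hedged sketch of a text classifier with scikit-learn's MultinomialNB (the tiny corpus and labels are invented):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["win a free lottery now", "meeting at noon tomorrow",
         "free prize win cash", "project update attached"]
labels = [1, 0, 1, 0]                    # 1 = spam, 0 = not spam
vec = CountVectorizer()
X = vec.fit_transform(texts)             # word-count features
clf = MultinomialNB().fit(X, labels)
print(clf.predict(vec.transform(["free lottery win"])))   # likely [1], i.e. spam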
Decision Trees
Decision Trees are tree-like models where each internal node represents a decision on a
feature, each branch represents an outcome of that decision, and each leaf node represents a
final class label or value. They work by recursively splitting the data based on feature values to
maximize information gain or reduce impurity (e.g., using Gini index or entropy). Decision Trees
are easy to interpret and visualize, but they can overfit the data if not properly pruned.
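A small scikit-learn sketch (Gini impurity is the default splitting criterion; limiting the depth is one simple way to curb overfitting):

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, y)
print(export_text(tree))   # human-readable view of the learned splits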
Linear Regression
Linear regression is a supervised learning algorithm for predicting a continuous output as a weighted sum of the input features plus a bias term. The weights are typically learned by minimizing the Mean Squared Error between predicted and actual values.
Logistic Regression
Despite its name, logistic regression is a classification algorithm. It passes a linear combination of the features through the sigmoid function to produce a probability between 0 and 1, and assigns the class by thresholding that probability.
Support Vector Machines (SVM)
SVM is a powerful classification algorithm that finds the best boundary (called the hyperplane)
that separates data points of different classes. It tries to maximize the margin between the two
classes. SVMs are effective in high-dimensional spaces and work well when there is a clear
margin of separation between classes. They can also handle non-linear classification using
kernel tricks, which map the input data into higher dimensions.
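A minimal sketch of an SVM with the RBF kernel in scikit-learn (standardizing features first, since SVMs are sensitive to feature scale):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))  # kernel trick for non-linear boundaries
model.fit(X_train, y_train)
print(model.score(X_test, y_test))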
Unsupervised Learning
Unsupervised learning is a type of machine learning where the model learns from data that has
no labels. The algorithm tries to find hidden patterns or structures within the data on its own. It is
commonly used for clustering, anomaly detection, and dimensionality reduction. The key goal is
to group similar data points or reduce the complexity of data while preserving important
relationships.
Clustering: K-means
K-means is a popular unsupervised learning algorithm used for clustering data into k groups. It
works by randomly selecting k centroids (initial cluster centers), assigning each data point to the
nearest centroid, and then updating the centroids based on the average position of the assigned
points. This process is repeated until the centroids stabilize. K-means is simple and efficient but
can be sensitive to the initial selection of centroids and does not work well with non-spherical
clusters.
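A quick scikit-learn sketch on invented 2-D points that form two rough groups:

import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1.0, 2.0], [1.5, 1.8], [1.2, 2.2],
              [8.0, 8.0], [8.5, 7.8], [7.9, 8.3]])
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)
print(labels)                   # cluster index assigned to each point
print(kmeans.cluster_centers_)  # final centroids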
Ensemble Methods
Ensemble methods combine multiple models to improve overall performance. The idea is that a
group of weak learners can come together to form a strong learner. Common ensemble
methods include:
● Boosting: Boosting builds models sequentially, where each new model tries to correct
the errors made by the previous ones. Algorithms like AdaBoost and Gradient Boosting
are examples. Boosting focuses on difficult examples and often results in high accuracy
but may overfit if not regularized properly.
● Random Forests: A Random Forest is an ensemble of decision trees, where each tree
is trained on a different subset of the data and a random subset of features. The final
prediction is made by aggregating the outputs of all trees (e.g., majority vote for
classification). Random Forests are robust, accurate, and can handle missing data well.
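As a rough sketch, these two ensemble styles can be compared with scikit-learn on an example dataset (the dataset choice is arbitrary):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
rf = RandomForestClassifier(n_estimators=100, random_state=0)   # bagging-style ensemble of trees
gb = GradientBoostingClassifier(random_state=0)                 # sequential boosting
print(cross_val_score(rf, X, y, cv=5).mean())
print(cross_val_score(gb, X, y, cv=5).mean())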
Dimensionality Reduction Techniques
Common techniques for reducing the number of features include:
● Principal Component Analysis (PCA): PCA transforms data into new coordinates that maximize variance, selecting the top components to reduce dimensions while preserving most of the information.
● Linear Discriminant Analysis (LDA): LDA is used primarily for classification tasks. Unlike PCA, which is unsupervised, LDA is supervised and tries to find feature combinations (linear discriminants) that best separate the classes.
● Singular Value Decomposition (SVD): SVD decomposes a matrix into three other
matrices and is widely used in dimensionality reduction, especially in text processing and
recommendation systems. It helps in reducing noise and compressing data.
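A minimal NumPy sketch of SVD on an invented matrix, including a best rank-1 reconstruction:

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0],
              [10.0, 11.0, 12.0]])
U, S, Vt = np.linalg.svd(A, full_matrices=False)
print(S)                                     # singular values, largest first
A_rank1 = S[0] * np.outer(U[:, 0], Vt[0])    # keep only the top singular value
print(np.round(A_rank1, 2))                  # compressed (rank-1) approximation of A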
Accuracy
Accuracy measures the proportion of correct predictions out of total predictions. It is simple and
intuitive but may be misleading in imbalanced datasets (where one class dominates).
Confusion Matrix
A confusion matrix is a table that shows the number of true positives (TP), true negatives (TN),
false positives (FP), and false negatives (FN). It provides a detailed breakdown of model
performance beyond just accuracy.
Precision and Recall
● Precision = TP / (TP + FP): It measures how many predicted positives are actually
correct.
● Recall = TP / (TP + FN): It measures how many actual positives were correctly
predicted.
Precision is useful when false positives are costly; recall is critical when missing
positives is more dangerous.
F1-score
The F1-score is the harmonic mean of precision and recall. It balances the two and is especially
useful when there is an uneven class distribution.
F1 = 2 * (Precision * Recall) / (Precision + Recall)
ROC Curve and AUC
● The Receiver Operating Characteristic (ROC) curve plots the true positive rate (recall)
against the false positive rate.
● AUC (Area Under the Curve) measures the area under the ROC curve. A model with an
AUC close to 1 performs well, while 0.5 indicates random guessing.
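A tiny scikit-learn sketch of AUC on made-up scores (0.5 would mean random guessing):

from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]               # actual labels
y_scores = [0.1, 0.4, 0.35, 0.8]    # predicted probabilities for the positive class
print(roc_auc_score(y_true, y_scores))   # 0.75 here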
Median Absolute Deviation (MAD)
MAD is a robust measure of variability. It is the median of the absolute differences between
each data point and the median of the dataset. Unlike standard deviation, MAD is less affected
by outliers, making it valuable in robust regression and error analysis.
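A tiny NumPy sketch showing how MAD resists an outlier that inflates the standard deviation (numbers invented):

import numpy as np

errors = np.array([1.0, 1.2, 0.8, 1.1, 15.0])   # one large outlier
median = np.median(errors)
mad = np.median(np.abs(errors - median))
print(mad)               # 0.1: barely affected by the outlier
print(np.std(errors))    # much larger, pulled up by the outlier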
Distribution of Errors
The distribution of errors refers to the analysis of how model prediction errors (difference
between predicted and actual values) are spread. Ideally, errors should be randomly distributed
with a mean close to zero. Analyzing the distribution helps identify patterns such as underfitting,
overfitting, or bias in the model.
✅ UNIT-I:
Definition (Example)
A spam filter in your email inbox learns from thousands of labeled emails ("spam" or "not spam")
to automatically classify new emails.
History (Example)
In 1959, Arthur Samuel developed a program that played checkers and improved by playing
against itself – one of the first ML applications.
Need (Example)
Netflix uses ML to recommend shows based on what you've already watched, improving user
experience and engagement.
Features of ML (Example)
Self-learning: Google Translate improves over time by learning from translations worldwide.
Classification of ML (Example)
● Supervised Learning: Predicting house prices using labeled data with features like size, location, and past prices.
● Unsupervised Learning: Grouping customers into segments based on purchasing behavior, without any labels.
● Reinforcement Learning: A robot learning to walk by trial and error, receiving rewards
for each successful step.
ML Lifecycle (Example)
Building a spam filter end to end: collect and label emails, train a classifier, evaluate it, deploy it in the inbox, and keep monitoring and retraining it as new email arrives.
Applications (Examples)
Email spam filtering, Netflix-style recommendations, and autonomous vehicles are everyday applications of ML.
Parametric vs Non-parametric (Example)
● Parametric: Linear regression, which assumes a fixed functional form with a fixed number of parameters.
● Non-parametric: k-NN, which adapts its complexity to the dataset and doesn’t assume a specific function form.
Bias and Variance (Example)
● High bias (Underfitting): Predicting all house prices as ₹50L – too simple.
● High variance (Overfitting): Memorizing the exact prices for training data but failing on
new data.
Loss Functions (Example)
● MSE (Mean Squared Error): Measures how far off predictions are in regression problems.
Cross-validation (Example)
K-fold cross-validation: Divides data into 5 parts, trains on 4 and tests on 1, repeating 5 times
for better accuracy estimate.
Grid Search (Example)
Testing various combinations of hyperparameters, such as learning rate and tree depth in a gradient-boosted tree model, to find the best performing one.
✅ UNIT-II:
Row & Column Vector (Example)
● Column vector:
[ 20 ]
[ 5.6 ]
[ 60 ]
● Row vectors (one observation per row across its features):
[ 25, 30000, 5.9 ]
[ 32, 45000, 6.1 ]
Feature Normalization (Example)
Income: ₹10K to ₹1L → Normalize to 0–1 range so it doesn't overpower smaller features like
age.
Column Standardization (Example)
Standardize test scores so each subject (column) has mean 0 and std deviation 1 before
applying algorithms.
Covariance Matrix (Example)
If height and weight have high positive covariance, taller people tend to weigh more.
PCA (Example)
In a dataset of 10 features, PCA might reduce it to 2 components that explain 95% of the variance, making it easier to visualize.
✅ UNIT-III:
k-NN (Example)
To classify a new fruit as an apple or orange, k-NN looks at its 3 nearest neighbors (based on
size, color) and picks the majority label.
Naïve Bayes (Example)
In spam filtering, if an email has the words "free", "win", and "lottery", Naïve Bayes uses the probabilities of each word appearing in spam to predict whether the email is spam.
Decision Tree (Example)
In an approval decision, each internal node asks a yes/no question about the applicant, and the branches of one such split lead to leaf decisions:
■ No → Approve
■ Yes → Decline
Logistic Regression (Example)
Predict if a student will pass (yes/no) based on study hours, using the sigmoid function to output probabilities.
SVM (Example)
Classifying emails as spam or not by finding the best dividing line (hyperplane) that separates
the two classes with maximum margin.
✅ UNIT-IV:
K-Means Clustering (Example)
Customer segmentation: Grouping people into clusters based on their spending habits without
any labels.
Boosting (Example)
A model first predicts wrongly that a person won't default on a loan. The next model focuses
more on this error, improving the final combined output.
Bagging (Example)
Random Forest creates different decision trees using random subsets of training data and
averages the results to reduce overfitting.
Random Forest (Example)
Predicting diabetes risk using multiple decision trees built from different patient samples,
combining their predictions.
PCA (Example)
Reduce a 100-feature image dataset to 10 features while retaining the most important
information.
LDA (Example)
Used in face recognition: separates images of different people by finding the directions that best
distinguish classes.
ICA (Example)
Separating mixed audio signals into individual speakers in a recording (blind source separation).
SVD (Example)
Used in recommender systems like Netflix: reduces movie rating matrix to uncover patterns in
user preferences.
Confusion Matrix (Example)
TP: 50, FP: 5
FN: 10, TN: 35
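Plugging these counts into the metric formulas above gives accuracy 0.85, precision ≈ 0.91, recall ≈ 0.83, and F1 ≈ 0.87, for example:

TP, FP, FN, TN = 50, 5, 10, 35
accuracy = (TP + TN) / (TP + FP + FN + TN)          # 85 / 100 = 0.85
precision = TP / (TP + FP)                          # 50 / 55 ≈ 0.91
recall = TP / (TP + FN)                             # 50 / 60 ≈ 0.83
f1 = 2 * precision * recall / (precision + recall)  # ≈ 0.87
print(accuracy, precision, recall, f1)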