Devtern

The document contains multiple choice questions and short answer questions related to machine learning concepts, including supervised and unsupervised learning, precision and recall, and the training-validation-test split. It also includes a coding exercise involving logistic regression using Python and pandas. The answers provided clarify key concepts and demonstrate practical application in machine learning.

Uploaded by

Pavan Barhate
Copyright
© © All Rights Reserved

Section A: Multiple Choice Questions (MCQs) [20 marks]

1. What is the primary goal of supervised learning?


a) Minimize error in predictions.

b) Discover hidden patterns in data.

c) Classify data into predefined categories.

d) Uncover relationships between variables.

Ans. a) Minimize error in predictions. Supervised learning covers regression as well as classification, so option c) describes only part of it.

2. Which algorithm is commonly used for regression problems in machine learning?


a) K-Nearest Neighbours.

b) Decision Trees.

c) Support Vector Machines.

d) Linear Regression.

Ans. d) Linear Regression.

3. What is the purpose of the activation function in a neural network?


a) Normalize input data.

b) Introduce non-linearity to the model.

c) Regularize the model.

d) Control the learning rate.

Ans. b) Introduce non-linearity to the model.

4. In unsupervised learning, what is the main objective?


a) Minimize prediction error.

b) Classify data into predefined categories.

c) Discover hidden patterns in data.

d) Train a model with labeled examples.

Ans. c) Discover hidden patterns in data.

5. What does the term "overfitting" refer to in machine learning?


a) Model performs well on training data but poorly on new data.

b) Model performs well on both training and test data.

c) Model is too simple to capture underlying patterns.

d) Model is perfectly fit to the training data.

Ans. a) Model performs well on training data but poorly on new data.

Section B: Short Answer Questions [30 marks]


6. Define the terms "precision" and "recall" in the context of classification metrics.
Ans. 1) Precision:

• Definition: Precision, sometimes referred to as positive predictive value, measures how
reliable a classification model's positive predictions are. It answers the question: "Of all
the instances predicted as positive, how many are truly positive?"
• Formula: Precision = TP / (TP + FP), the ratio of true positive predictions to the total of
true positives and false positives.

• Interpretation: A high precision value indicates that most instances the model labels as
positive really are positive, i.e. the model has a low rate of false positives.

2) Recall:

• Definition: Recall, sometimes referred to as sensitivity or true positive rate, measures a
model's ability to find every relevant instance of the positive class. It answers the
question: "Of all the truly positive instances, how many were correctly predicted?"
• Formula: Recall = TP / (TP + FN), the ratio of true positive predictions to the total of
true positives and false negatives.

• Interpretation: A high recall value means that the model produces few false negatives and
captures most of the positive instances.
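
As a minimal sketch of the two ratios described in this answer, the following counts true positives, false positives, and false negatives from a pair of label lists. The function name `precision_recall` and the sample labels are made up for illustration and are not part of the original answer.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels: 4 actual positives, 5 predicted positives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision = {precision:.2f}")  # 3 TP / (3 TP + 2 FP) = 0.60
print(f"recall    = {recall:.2f}")     # 3 TP / (3 TP + 1 FN) = 0.75
```

Here the model finds three of the four real positives (recall 0.75) but two of its five positive predictions are wrong (precision 0.60), which shows that the two metrics can move independently.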

7. Explain the difference between reinforcement learning and supervised learning.


Ans. A) Reinforcement Learning:

1) Objective:
• Definition: Reinforcement learning (RL) is a type of machine learning in which an agent
learns how to behave in a given environment by acting and receiving feedback in the form of
rewards or penalties.
• Objective: The agent seeks to discover a policy, a mapping from states to actions, that
maximises the total reward accumulated over time.

2) Training Data:
• Supervision: Reinforcement learning does not require labelled training data with explicit
input-output pairs.
• Exploration-Exploitation: The agent explores its environment, takes actions, and gains
experience through trial and error.
3) Feedback:
• Reward Signal: After every action, the agent receives a numerical reward signal that
indicates how desirable that action was in the short term.

4) Examples:
• Applications: Reinforcement learning is frequently used in settings such as game playing,
robotic control, and autonomous decision-making, where an agent interacts with a dynamic
environment.

B) Supervised Learning:

1) Objective:
• Definition: Supervised learning is a type of machine learning in which the model is trained
on labelled datasets made up of input-output pairs.
• Objective: The goal is to learn a mapping from inputs to their corresponding outputs so that
the model can make accurate predictions on new, unseen data.

2) Training Data:
• Labelled Data: Supervised learning requires a dataset in which every example has known
inputs and an associated correct output.
• Training Process: Training aims to reduce the difference between the model's predictions
and the actual outputs in the training set.

3) Feedback:
• Error Signal: The model receives explicit feedback in the form of an error signal, computed
as the discrepancy between its predictions and the actual labels.

4) Examples:
• Applications: Supervised learning is frequently used in tasks such as speech recognition,
image classification, and natural language processing, where the model is trained on labelled
data in order to make accurate predictions on new, unseen data.

Here are the key differences:

1. Training Data:
• Reinforcement Learning: No labelled training data; learns from interacting with the
environment.
• Supervised Learning: Relies on labelled training data with explicit input-output pairs.

2. Objective:
• Reinforcement Learning: Learns a policy to maximize cumulative reward through trial and
error.
• Supervised Learning: Learns a mapping from inputs to outputs to make accurate predictions.

3. Feedback:
• Reinforcement Learning: Receives a reward signal indicating the immediate desirability of
actions.
• Supervised Learning: Receives explicit error signals based on the difference between
predictions and true labels.
4. Applications:
• Reinforcement Learning: Applied in dynamic environments with interactive agents.
• Supervised Learning: Commonly used in tasks with labelled datasets, such as classification
and regression.
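
The difference in feedback can be illustrated with a small sketch. The first function learns from labelled pairs using an explicit error signal; the second is an epsilon-greedy agent on a two-armed bandit that only ever sees rewards for the actions it tries. Both functions, their parameters, and the toy data are assumptions chosen for illustration, and the bandit is a deliberately simplified RL setting with no states.

```python
import random


def supervised_fit(pairs, lr=0.01, epochs=200):
    """Fit y ~ w * x by gradient descent on labelled (x, y) pairs."""
    w = 0.0
    for _ in range(epochs):
        for x, y in pairs:
            error = w * x - y      # explicit error signal computed from the label
            w -= lr * error * x    # move w to shrink the squared error
    return w


def bandit_learn(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Estimate each action's value from rewards alone (no labels)."""
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    values = [0.0] * n_actions     # running average reward per action
    counts = [0] * n_actions
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)                       # explore
        else:
            action = max(range(n_actions), key=lambda a: values[a])  # exploit
        # The environment returns only a reward, never the "correct" action.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
    return values


# Supervised: every input comes with its correct output (here y = 2x).
w = supervised_fit([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
print(f"learned weight: {w:.3f}")

# Reinforcement: hypothetical reward probabilities, hidden from the agent.
values = bandit_learn([0.2, 0.8])
best_action = max(range(len(values)), key=lambda a: values[a])
print(f"estimated action values: {values}, preferred action: {best_action}")
```

The supervised learner is told how wrong each prediction was, while the agent has to discover the better action by trying both and averaging the rewards it happens to receive.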

8. What is the purpose of the training-validation-test split in machine learning?


Ans. In machine learning, splitting the data into training, validation, and test sets is an
essential step in building a model. The available dataset is divided into three separate sets:
a training set, a validation set, and a test set. Each set serves a distinct purpose in training
and assessing machine learning models.

1. Training Set:
• Purpose: The machine learning model is trained using the training set. During training, the
model's parameters are adjusted to reduce the discrepancy between its predictions and the
actual results as it learns patterns and relationships in the data.
• Size: The training set is typically the largest of the three sets, as a larger dataset
provides more information for the model to learn from.

2. Validation Set:
• Purpose: During the training phase, the validation set is used to tune the model's
hyperparameters and to give an objective assessment of the model fit. It helps detect
overfitting, which occurs when a model performs well on training data but fails to generalise
to new, unseen data.
• Hyperparameter Tuning: Based on performance on the validation set, the model's
hyperparameters (configuration settings not learnt from the data, such as the learning rate)
are adjusted.
• Size: Though smaller than the training set, the validation set should still be
representative of the whole dataset.

3. Test Set:
• Purpose: The test set is set aside for the final assessment of the trained model's
performance. It offers an objective evaluation of the model's ability to generalise to new
data that it has not encountered during training or validation.
• Unseen Data Evaluation: The test set provides an estimate of the model's performance on
real data, indicating how well it should perform in a real-world setting.
• Size: The test set is separate from the training and validation sets, and it should be large
enough to give a reliable assessment of the model's ability to generalise.
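
A three-way split can be sketched in a few lines of standard-library Python. The 60/20/20 proportions, the function name `train_val_test_split`, and the stand-in data are assumptions for illustration; the answer itself does not prescribe particular sizes.

```python
import random


def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the rows once, then cut them into three disjoint sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # shuffle a copy so the caller's data is untouched
    n_test = round(len(rows) * test_frac)
    n_val = round(len(rows) * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]      # everything left over is used for training
    return train, val, test


data = list(range(100))                # stand-in for 100 labelled examples
train, val, test = train_val_test_split(data)
print(len(train), len(val), len(test))
```

In practice the model is fitted on `train`, hyperparameters are chosen by comparing scores on `val`, and `test` is scored once at the very end so that it stays unseen.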

9. Provide an example of a real-world application where clustering is used.


Ans. Real-world Application: Consider an online retailer that offers a range of goods. With a
sizable customer base, the business hopes to increase customer engagement and improve sales
through targeted marketing initiatives. By applying clustering techniques to historical
customer interaction and purchase data, the business can identify distinct groups of
customers with similar behaviour.

Clustering Process:
1. Data Collection: Compile information about customer interactions, including past purchases,
browsing patterns, amount of time spent on the website, and the kinds of goods that have
been viewed or bought.
2. Feature Selection: Select relevant features, such as purchase frequency, average order
value, and interest in particular product categories, that accurately reflect the essential
elements of customer behaviour.
3. Clustering Algorithm: Group customers according to the chosen features by using a clustering
technique, such as DBSCAN, k-means clustering, or hierarchical clustering. The algorithm will
find groups of customers with similar behaviour.
4. Interpretation of Clusters: Examine the resulting clusters to learn about the traits of each
group. Customers who mostly make purchases during sales events may be in one cluster,
while regular high-value shoppers may be in another.
5. Marketing Strategy: Develop marketing plans specific to each cluster. For example, the
business may offer loyalty rewards or customised discounts to high-value customers, while
sending targeted promotions or product recommendations based on customer preferences to
customers in other clusters.
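
To make step 3 concrete, here is a minimal k-means sketch in plain Python. The customer tuples (purchases per month, average order value) are invented, and taking the first k points as starting centroids is a simplification chosen to keep the example deterministic; real use would normally scale the features and use a library implementation.

```python
import math


def kmeans(points, k, n_iter=20):
    """Plain k-means: alternate assignment and centroid-update steps."""
    # Simplification: start from the first k points instead of a random draw.
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # centroids stopped moving, so the clustering has converged
        centroids = new_centroids
    return centroids, assignments


# Hypothetical customers: (purchases per month, average order value)
customers = [(1, 20), (12, 150), (2, 25), (11, 160), (1, 30), (13, 140)]
centroids, assignments = kmeans(customers, k=2)
print(assignments)
print(centroids)
```

With this toy data the occasional low-spend shoppers and the frequent high-value shoppers end up in separate clusters, which is the kind of grouping step 4 then interprets.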

Section C: Coding Exercise [50 marks]


10. Programming with Python
You are given a dataset (data.csv) containing two columns: "Feature" and "Label." Using the
pandas library, load the dataset, split it into training and testing sets (80% training, 20% testing),
and train a logistic regression model to predict the labels based on the features. Finally, evaluate
the model's accuracy on the test set.

Ans.
Hence, the model's accuracy on the test set is 98.5%.
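
One way to carry out the steps in the question is sketched below. Because data.csv is not available here, the sketch first writes a small synthetic stand-in file (the helper `make_synthetic_csv` and its data are hypothetical), then loads it with pandas. It uses scikit-learn for the split and the model, which is an assumption since the question names only pandas. The accuracy it prints depends on the data and will not necessarily match the figure reported in the answer.

```python
import os
import tempfile

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def make_synthetic_csv(path, n_rows=200, seed=0):
    """Write a stand-in for data.csv (invented data, not the original file)."""
    rng = np.random.default_rng(seed)
    feature = rng.normal(loc=0.0, scale=2.0, size=n_rows)
    # Label is 1 when the feature is positive, with about 5% of labels flipped as noise.
    label = (feature > 0).astype(int)
    flip = rng.random(n_rows) < 0.05
    label[flip] = 1 - label[flip]
    pd.DataFrame({"Feature": feature, "Label": label}).to_csv(path, index=False)


def train_and_evaluate(csv_path, seed=42):
    """Load the CSV, split 80/20, fit logistic regression, return test accuracy."""
    df = pd.read_csv(csv_path)
    X = df[["Feature"]]   # double brackets keep X two-dimensional, as scikit-learn expects
    y = df["Label"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = LogisticRegression()
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return accuracy, len(X_train), len(X_test)


with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "data.csv")
    make_synthetic_csv(csv_path)
    accuracy, n_train, n_test = train_and_evaluate(csv_path)

print(f"Train rows: {n_train}, test rows: {n_test}")
print(f"Model accuracy: {accuracy:.1%}")
```

With the real dataset, the synthetic-file step would be dropped and `train_and_evaluate("data.csv")` called directly.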
