22AI2101 CIA II QB With Answer Key
MACHINE LEARNING
PART A
1. Outline the purpose of Alpha-Beta pruning in game tree search.
Alpha-Beta pruning is used to reduce the number of nodes evaluated in the minimax
algorithm by eliminating branches that cannot possibly influence the final decision,
thus improving computational efficiency.
2. Summarize how Alpha-Beta pruning improves the efficiency of the Minimax
algorithm.
It prunes branches that cannot change the minimax value at the root, so fewer nodes are
evaluated. With ideal move ordering, the time complexity drops from O(b^d) to O(b^(d/2)),
where b is the branching factor and d is the search depth.
3. Identify the main components of a machine learning system.
The main components of a machine learning system are:
o Data: The input examples used for training and testing.
o Model: The algorithm or mathematical representation that learns
patterns from the data to make predictions or decisions.
4. Compare supervised, unsupervised, and reinforcement learning.
Supervised learning uses labeled data to train models (e.g., classification),
unsupervised learning identifies patterns in unlabeled data (e.g., clustering), and
reinforcement learning learns through rewards and punishments from interacting with
an environment (e.g., game playing).
5. Summarize the assumptions of PAC Learning.
PAC (Probably Approximately Correct) Learning assumes that:
o The learning algorithm receives independent and identically distributed (i.i.d.)
examples from a fixed but unknown distribution.
o The hypothesis class contains a function that can approximate the target concept well.
6. Recall the basic probability laws used in ML.
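The basic probability laws used in ML are the addition rule,
P(A ∪ B) = P(A) + P(B) − P(A ∩ B); the multiplication rule, P(A ∩ B) = P(A | B) P(B);
and conditional probability, P(A | B) = P(A ∩ B) / P(B), from which Bayes' theorem follows.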
PART B
1. Explain how Alpha-Beta pruning is applied to a game tree and describe how it improves
the efficiency of the Minimax algorithm. Support your explanation with a suitable example
and diagram.
Alpha-Beta pruning application:
o Explain with game tree example
o Show nodes that would be pruned
o Compare with full minimax evaluation
o Diagram showing pruning process
2. Describe step-by-step how Alpha-Beta pruning works and identify which nodes are pruned
in the game tree. Use a clear example and diagram to support your explanation.
Alpha-Beta pruning steps:
o Initialize alpha (-∞) and beta (+∞)
o Depth-first search
o Update alpha at MAX nodes
o Update beta at MIN nodes
o Prune when alpha ≥ beta
o Example with values showing pruned branches
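The steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the nested-list tree encoding, the function name alphabeta, the visited list, and the leaf values are all assumptions made for this example.

```python
import math

def alphabeta(node, alpha, beta, maximizing, visited):
    """Depth-first alpha-beta search over a nested-list game tree.

    A leaf is a plain number; an internal node is a list of children.
    `visited` records every leaf that is actually evaluated.
    """
    if not isinstance(node, list):
        visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)      # update alpha at MAX nodes
            if alpha >= beta:              # remaining children cannot matter
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)            # update beta at MIN nodes
        if alpha >= beta:
            break
    return value

# Hypothetical tree: a MAX root with three MIN children, each with three leaves.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
best = alphabeta(tree, -math.inf, math.inf, True, visited)
print(best, visited)
```

In this tree the second MIN node is cut off after its first leaf (2), because MAX already has a guaranteed value of 3 from the first branch. Leaves 4 and 6 are never evaluated, whereas plain minimax would evaluate all nine leaves.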
3. Demonstrate the different learning paradigms in Machine Learning with suitable examples.
Learning paradigms:
o Supervised: Classification/Regression (e.g., spam detection)
o Unsupervised: Clustering/Dimensionality reduction (e.g., customer
segmentation)
o Reinforcement: Reward-based learning (e.g., game playing AI)
o Semi-supervised: Mixed labeled/unlabeled data
4. Illustrate with an example how the Version Space algorithm can be used to update a
hypothesis space during learning.
Version Space algorithm:
o General and specific boundary sets
o Candidate elimination algorithm steps
o Example with hypothesis space updating
o Convergence properties
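A simplified sketch of the candidate elimination update for conjunctive hypotheses is given below. It assumes noise-free data, that the first training example is positive, and that the specific boundary S can be kept as a single hypothesis; it also skips the removal of redundant members of G. The function names and the weather-style training data are illustrative assumptions, not part of the question bank.

```python
def covers(h, x):
    """True if hypothesis h classifies instance x as positive ('?' matches anything)."""
    return all(hv == "?" or hv == xv for hv, xv in zip(h, x))

def candidate_elimination(examples):
    n = len(examples[0][0])
    S = None                 # specific boundary, set by the first positive example
    G = [("?",) * n]         # general boundary starts with the all-'?' hypothesis
    for x, positive in examples:
        if positive:
            # Drop general hypotheses that reject a positive example.
            G = [g for g in G if covers(g, x)]
            # Minimally generalize S so that it covers x.
            S = tuple(x) if S is None else tuple(
                s if s == xv else "?" for s, xv in zip(S, x))
        else:
            # Assumes S is already set (first example positive).
            new_G = []
            for g in G:
                if not covers(g, x):
                    new_G.append(g)      # already rejects the negative example
                    continue
                # Minimally specialize g, guided by S, so that it rejects x.
                for i in range(n):
                    if g[i] == "?" and S[i] != "?" and S[i] != x[i]:
                        new_G.append(g[:i] + (S[i],) + g[i + 1:])
            G = new_G
    return S, G

# Illustrative training data: (sky, temp, humidity, wind, water, forecast), label
examples = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]
S, G = candidate_elimination(examples)
print("S:", S)
print("G:", G)
```

Each positive example generalizes S and filters G; each negative example specializes G. The version space is the set of hypotheses lying between the two boundaries.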
5. Describe in detail the concept of Probably Approximately Correct (PAC) learning. How
does it ensure model reliability?
PAC Learning:
o Formal definition
o Sample complexity
o Computational complexity
o Relationship to VC dimension
o Examples of PAC-learnable classes
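For reference, the formal definition and a standard sample complexity bound can be written as below. The notation is introduced here and is not from the question bank: c is the target concept, D the unknown distribution, h_S the hypothesis learned from a sample S of m examples, and H a finite hypothesis class. The bound shown applies to a learner that outputs a hypothesis consistent with the training data.

```latex
% PAC guarantee: with probability at least 1 - \delta, the learned hypothesis
% has true error at most \varepsilon.
\Pr_{S \sim D^{m}}\!\left[ \operatorname{error}_{D}(h_S) \le \varepsilon \right] \ge 1 - \delta,
\qquad
\operatorname{error}_{D}(h) = \Pr_{x \sim D}\!\left[ h(x) \ne c(x) \right]

% Sufficient number of examples for a consistent learner over a finite class H.
m \ge \frac{1}{\varepsilon}\left( \ln |H| + \ln \frac{1}{\delta} \right)
```

The required sample size grows only logarithmically in |H| and 1/δ, but linearly in 1/ε.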
6. Explain, with an example, how conditional probability and Bayes’ theorem are used in
model prediction.
Conditional probability & Bayes:
o Bayes' theorem formula
o Prior, likelihood, posterior
o Example: Medical diagnosis
o Naïve Bayes classifier application
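The medical diagnosis example can be worked through numerically as in the sketch below. The prevalence, sensitivity, and false positive rate are made-up illustrative values, and the function name posterior is an assumption.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) via Bayes' theorem."""
    # Evidence: total probability of a positive test (sick or healthy).
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence

# Illustrative numbers: 1% prevalence, 95% sensitivity, 5% false positive rate.
p = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(f"P(disease | positive test) = {p:.3f}")
```

With these numbers the posterior is only about 16%, even though the test is 95% sensitive, because the prior (1% prevalence) is low. This is the role of the prior in the prior, likelihood, posterior decomposition.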
7. Explain the differences between linear and non-linear models with appropriate examples.
Compare their advantages and limitations.
Linear vs Non-linear models:
o Linear: Simpler, interpretable (e.g., Linear Regression)
o Non-linear: Complex patterns (e.g., Neural Networks)
o Examples of each type
o Tradeoffs in bias-variance
8. Explain the steps of performing K-Means clustering on a dataset. Discuss how to choose
the number of clusters and interpret the results.
K-Means steps:
o Initialize centroids
o Assign points to nearest centroid
o Recalculate centroids
o Repeat until convergence
o Methods for choosing k (elbow, silhouette)
o Limitations and variants
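The steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the initial centroids are passed in explicitly rather than chosen at random, the data points are made up, and the function names kmeans and inertia are hypothetical.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centroids.append(centroids[j])   # keep an empty cluster's centroid
        if new_centroids == centroids:               # converged
            break
        centroids = new_centroids
    return centroids, labels

def inertia(points, centroids, labels):
    """Within-cluster sum of squared distances, the quantity plotted in the elbow method."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))

points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, labels = kmeans(points, [(1, 1), (9, 8)])
print(centroids, labels, inertia(points, centroids, labels))
```

To apply the elbow method, one would run this for several values of k and look for the point where the inertia stops dropping sharply.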
9. Apply linear regression to a given dataset and interpret the resulting model. Include
derivation of the equation, assumptions, and error analysis.
Linear regression application:
o Model equation derivation
o Assumptions (linearity, independence, etc.)
o Cost function (MSE)
o Gradient descent
o Evaluation metrics (R², RMSE)
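A small worked example of the points above is sketched below. It uses the closed-form least-squares solution (slope = Sxy / Sxx, intercept = mean(y) − slope · mean(x)); gradient descent on the MSE would converge to the same line. The data values and function names are illustrative assumptions.

```python
import math

def fit_line(xs, ys):
    """Closed-form least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def evaluate(xs, ys, slope, intercept):
    """Return (R squared, RMSE) of the fitted line on the given data."""
    preds = [intercept + slope * x for x in xs]
    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))   # residual sum of squares
    mean_y = sum(ys) / len(ys)
    sst = sum((y - mean_y) ** 2 for y in ys)             # total sum of squares
    return 1 - sse / sst, math.sqrt(sse / len(ys))

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = fit_line(xs, ys)
r2, rmse = evaluate(xs, ys, slope, intercept)
print(f"y = {intercept:.2f} + {slope:.2f} x, R^2 = {r2:.2f}, RMSE = {rmse:.3f}")
```

For this data the fitted line is y = 2.2 + 0.6x with R² = 0.6, meaning the line explains 60% of the variance in y.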
10. Use multiple linear regression on a given dataset with two or more independent variables
to derive coefficients and evaluate the model’s performance.
o Model Setup
o Coefficient Derivation
o Performance Metrics
o Critical Assumptions
o Example
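The coefficient derivation can be illustrated with the normal equations, (XᵀX) β = Xᵀy, solved here by Gaussian elimination in plain Python. This is a sketch under stated assumptions: the data are synthetic and generated exactly from y = 1 + 2·x1 + 3·x2, and the helper names solve and fit are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]     # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                   # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit(X, y):
    """Return [intercept, b1, b2, ...] from the normal equations."""
    Xb = [[1.0] + list(row) for row in X]            # prepend the intercept column
    p = len(Xb[0])
    XtX = [[sum(r[i] * r[j] for r in Xb) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(Xb, y)) for i in range(p)]
    return solve(XtX, Xty)

# Synthetic data generated from y = 1 + 2*x1 + 3*x2 (no noise).
X = [(1, 1), (2, 1), (1, 2), (3, 2), (2, 3)]
y = [6, 8, 9, 13, 14]
beta = fit(X, y)
print([round(b, 6) for b in beta])
```

Because the data are noise-free, the fit recovers the generating coefficients up to floating-point error. On real data the same procedure gives the least-squares estimates, which would then be judged with R², adjusted R², and RMSE.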
11. Apply Minimax and Alpha-Beta pruning to a game tree and compare their efficiency with
examples.
Minimax vs Alpha-Beta:
o Minimax full tree evaluation
o Alpha-Beta pruning process
o Efficiency comparison
o Game tree example showing pruned nodes
12. Demonstrate how adversarial search techniques are used in real-world AI game-playing
systems like chess engines and strategic decision-making, with examples.
Adversarial search applications:
o Chess engine minimax with evaluation
o Real-time strategy game AI
o Negotiation systems
o Security applications
13. Explain supervised, unsupervised, and reinforcement learning, highlighting the advantages
and limitations of each.
Learning paradigms comparison:
o Supervised: Needs labels, predictive accuracy
o Unsupervised: Finds hidden patterns
o Reinforcement: Long-term reward optimization
o Strengths/weaknesses of each
14. Describe the use of probability theory in machine learning and explain the roles of prior,
likelihood, and posterior in learning models.
Probability in ML:
o Bayesian learning framework
o Prior knowledge incorporation
o Likelihood function
o Posterior updating
o Applications in classification
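Posterior updating can be illustrated with a Beta prior on a Bernoulli success probability, where the update reduces to adding counts. The sketch below assumes a uniform Beta(1, 1) prior and made-up observations (7 successes, 3 failures); the function names are hypothetical.

```python
from fractions import Fraction

def update_beta(a, b, successes, failures):
    """Beta(a, b) prior + Bernoulli data -> Beta(a + successes, b + failures) posterior."""
    return a + successes, b + failures

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return Fraction(a, a + b)

prior_a, prior_b = 1, 1                       # uniform prior: no initial preference
post_a, post_b = update_beta(prior_a, prior_b, successes=7, failures=3)
prior_mean = beta_mean(prior_a, prior_b)
post_mean = beta_mean(post_a, post_b)
print(prior_mean, post_mean)
```

The posterior mean (2/3) lies between the prior mean (1/2) and the observed frequency (7/10), showing how the likelihood of the data pulls the prior toward the evidence.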
15. Analyze the PAC learning framework and describe its significance in machine learning
model evaluation. Include examples.
PAC Learning analysis:
o Formal definition
o Sample complexity bounds
o Computational tractability
o Relationship to generalization
o Example applications
16. Apply the basic laws of probability (addition, multiplication, conditional) to solve a real-
world ML classification problem. Provide suitable data.
Probability laws application:
o Problem setup (e.g., medical test)
o Addition rule for unions
o Multiplication for intersections
o Bayes' for conditional
o Complete probabilistic model
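A small worked example of the three laws on a spam-filtering problem is sketched below. The email counts are invented for illustration and are not from the question bank; exact fractions are used so the arithmetic can be checked by hand.

```python
from fractions import Fraction

# Illustrative counts: 100 emails, 40 spam, 30 contain the word "offer",
# and 25 are both spam and contain "offer".
total, spam, offer, spam_and_offer = 100, 40, 30, 25

p_spam = Fraction(spam, total)
p_offer = Fraction(offer, total)
p_offer_given_spam = Fraction(spam_and_offer, spam)    # conditional probability

# Multiplication rule: P(spam and offer) = P(offer | spam) * P(spam)
p_both = p_offer_given_spam * p_spam

# Addition rule: P(spam or offer) = P(spam) + P(offer) - P(spam and offer)
p_either = p_spam + p_offer - p_both

# Bayes' theorem: P(spam | offer) = P(offer | spam) * P(spam) / P(offer)
p_spam_given_offer = p_offer_given_spam * p_spam / p_offer

print(p_both, p_either, p_spam_given_offer)
```

Here P(spam | "offer") = 5/6, so an email containing "offer" would be classified as spam under a 0.5 decision threshold.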
17. Differentiate between multi-class and multi-label classification problems. Provide real-
world examples and algorithms used.
Multi-class vs Multi-label:
o Definition and examples
o Evaluation metrics differences
o Algorithm adaptations
o Use cases for each
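The difference in target encoding and evaluation can be sketched as follows. The class names, predictions, and helper names are illustrative assumptions. A multi-class target has exactly one active class per sample, while a multi-label target may have any number.

```python
def one_hot(label, classes):
    """Multi-class target: exactly one class is active."""
    return [1 if c == label else 0 for c in classes]

def multi_hot(labels, classes):
    """Multi-label target: any subset of classes may be active."""
    return [1 if c in labels else 0 for c in classes]

def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for tr, pr in zip(y_true, y_pred) for t, p in zip(tr, pr))
    return wrong / total

def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose full label set is predicted exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

classes = ["cat", "dog", "bird"]
single = one_hot("dog", classes)                 # one image, one class
multi = multi_hot({"cat", "bird"}, classes)      # one image, two tags

y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0]]
print(single, multi, hamming_loss(y_true, y_pred), subset_accuracy(y_true, y_pred))
```

Hamming loss gives partial credit for the first sample (two of its three labels are right), whereas subset accuracy counts it as entirely wrong. This is why the two problem types call for different metrics.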
18. Explain how the Naïve Bayes Classifier works and describe how conditional probability
and Bayes’ theorem are used in classifying text documents.
Naïve Bayes classifier:
o Mathematical formulation
o Feature independence assumption
o Training process
o Text classification example
o Smoothing techniques
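A minimal multinomial Naïve Bayes text classifier with Laplace (add-one) smoothing is sketched below. The four training documents, the whitespace tokenization, and the function names train and predict are assumptions for illustration. Words never seen in training are simply ignored at prediction time.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for word in text.split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def predict(text, model, alpha=1.0):
    """Return the class with the highest log posterior (up to a constant)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)        # log prior
        total_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue
            # Smoothed likelihood P(word | class); independence lets us sum logs.
            score += math.log((word_counts[label][word] + alpha)
                              / (total_words + alpha * len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

docs = [
    ("win money now", "spam"),
    ("free money offer", "spam"),
    ("meeting schedule today", "ham"),
    ("project meeting notes", "ham"),
]
model = train(docs)
print(predict("free money", model), predict("meeting today", model))
```

Smoothing matters here because "free" never appears in a ham document: without alpha, its likelihood under ham would be zero and the log would be undefined.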
19. Construct a decision tree manually using a small dataset. Use entropy and information gain
to justify each split.
Decision tree construction:
o Entropy/Information Gain calculation
o Recursive splitting
o Stopping criteria
o Example with small dataset
o Tree visualization
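The entropy and information gain calculations used to justify a split can be sketched as below. The four-row dataset and the feature names (outlook, windy) are made up for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    n = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [lab for f, lab in zip(feature_values, labels) if f == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# Illustrative dataset: should we play outside?
outlook = ["sunny", "sunny", "rainy", "rainy"]
windy = ["yes", "no", "yes", "no"]
play = ["no", "no", "yes", "yes"]

gain_outlook = information_gain(outlook, play)
gain_windy = information_gain(windy, play)
print(entropy(play), gain_outlook, gain_windy)
```

Splitting on outlook separates the classes perfectly (gain of 1 bit), while windy gives no information (gain of 0), so outlook would be chosen as the root. The same comparison is repeated recursively within each branch.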
20. Apply the key assumptions and limitations of decision trees to a problem and demonstrate
how overfitting can be reduced using pruning or ensemble methods.
Key assumptions:
o Each split tests a single feature, so features are considered one at a time
o Splits are axis-aligned (perpendicular to feature axes)
o Splitting is greedy and locally optimal (maximizes immediate information gain)
Main limitations:
o High tendency to overfit (creates overly complex trees)
o Poor generalization with noisy data
o Instability (small data changes cause large tree structure changes)
o Bias toward features with more levels/higher variance
Pruning solution:
o Post-pruning: Grows the full tree, then removes insignificant branches
o Uses a validation set to determine optimal tree size
o Reduces complexity while maintaining accuracy
o Example: Cost-complexity pruning with α parameter tuning
Ensemble methods:
o Random Forest: Builds multiple de-correlated trees via bagging
o Boosting (XGBoost): Sequentially corrects errors with weighted trees
o Bagging mainly reduces variance and boosting mainly reduces bias; both improve
generalization
o Example (illustrative figures): Single tree 75% accuracy → Random Forest 85% accuracy
PART C
1. Analyze the effectiveness of supervised, unsupervised, and reinforcement learning
paradigms in solving a real-world task like fraud detection or customer segmentation,
supporting your analysis with examples.
Learning paradigms analysis:
o Supervised for fraud detection (labeled fraud cases)
o Unsupervised for customer segmentation (no labels)
o Reinforcement for adaptive systems (sequential decisions)
o Compare effectiveness for each task
2. Examine and formulate an approach to determine whether a concept class is PAC-learnable,
discussing the factors that influence sample complexity in the model.
PAC-learnability approach:
o Define concept class
o Determine VC dimension
o Calculate sample complexity
o Discuss approximation parameters (ε, δ)
o Example analysis
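As an example analysis, the sketch below computes a sample complexity for conjunctions of n Boolean literals. It assumes a learner that outputs a hypothesis consistent with the training data and uses the standard finite-class bound m ≥ (1/ε)(ln|H| + ln(1/δ)), with |H| = 3^n because each variable can appear positively, appear negated, or be absent. The function name and the chosen values of n, ε, and δ are illustrative.

```python
import math

def sample_complexity(hypothesis_count, epsilon, delta):
    """Smallest integer m with m >= (ln|H| + ln(1/delta)) / epsilon."""
    return math.ceil((math.log(hypothesis_count) + math.log(1 / delta)) / epsilon)

n = 10                                   # number of Boolean variables
H_size = 3 ** n                          # conjunctions of up to n literals
m = sample_complexity(H_size, epsilon=0.1, delta=0.05)
m_tighter = sample_complexity(H_size, epsilon=0.05, delta=0.05)
print(m, m_tighter)
```

Halving ε doubles the required number of examples, while |H| and δ enter only through logarithms. For infinite hypothesis classes the ln|H| term is replaced by a term based on the VC dimension.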
3. Analyze a dataset containing student test scores and demographic data to determine which
supervised learning model (e.g., Decision Tree, Naïve Bayes, or Linear Regression) is most
suitable for predicting final grades. Justify your choice.
Model selection analysis:
o Dataset exploration
o Feature analysis
o Model suitability comparison
o Evaluation metric selection
o Final recommendation with justification
4. Design and analyze a predictive model using multilinear regression to forecast house prices
based on features like size, location, and age. Explain each step from data preparation to
model validation.
House price prediction:
o Data cleaning/preprocessing
o Feature selection/engineering
o Model formulation
o Coefficient interpretation
o Validation approach
o Error analysis