
COS4852/A2/0/2024

Tutorial Letter A2/0/2024

Machine Learning
COS4852

Year module

School of Computing

IMPORTANT INFORMATION

This document contains the questions for Assignment 2 for COS4852 for 2024.

CONTENTS

1 INTRODUCTION
2 Assignment 2


1 INTRODUCTION

This document contains the questions for Assignment 2 for COS4852 for 2024.

2 Assignment 2

Question 1

Read Chapter 4 of Nilsson’s book (which you downloaded for Assignment 1). Take special note
of section 4.1 and its discussion of decision boundaries and their polarity and the concept of the
neural network weight-space. The terms TLU and Perceptron refer to the same construct here (i.e.
a neuron with weights and a threshold activation function). Similarly, the terms hyperplane, decision
boundary, and decision surface refer to the same concept, which linearly divides a space into two
sub-spaces. The space can have any number of dimensions. In a 2-dimensional space the decision
boundary is a straight line. There is a direct mapping between the decision boundaries and the
weights of the neural network.

Go to https://www.thomascountz.com/2018/04/13/calculate-decision-boundary-of-perceptron, which shows how to draw the decision boundary of a Perceptron from its known weights, using the simple linear algebra of a straight line.
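In the notation used there (a brief recap; the convention is that the bias weight w0 multiplies a constant input of 1), the decision boundary is the set of points where the weighted sum is zero:

$$w_0 + w_1 x + w_2 y = 0 \quad\Longrightarrow\quad y = -\frac{w_1}{w_2}\,x - \frac{w_0}{w_2} \qquad (w_2 \neq 0),$$

a straight line with slope $-w_1/w_2$ and y-intercept $-w_0/w_2$.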

You can also look at https://www.thomascountz.com/2018/03/23/perceptrons-in-neural-networks.

Question 1(a)

Consider the Perceptron in Figure 1. The bias and weights are shown in the figure. It uses the
threshold activation function:

$$f_{\text{threshold}}(z) = \begin{cases} 1 & z \geq 0 \\ 0 & z < 0 \end{cases}$$

Figure 1: Neuron p1 with inputs x and y, bias w0p1 = 6, and weights w1p1 = -1 (from x) and w2p1 = -3 (from y).

The Perceptron classifies the following instances:

P1 = (-4, 4)
P2 = (1, 4)
P3 = (2, 6)
P4 = (6, 2)
N1 = (-3, 1)
N2 = (1, -5)
N3 = (6, -4)
N4 = (4, -2)

Positive instances are marked as Pi and negative instances as Ni.

i) Draw the input (x, y)-space for this Perceptron showing the 8 instances.

ii) Calculate where the decision boundary will be and draw that in the input space. Show all your
steps and calculations.

iii) Does the Perceptron classify the data correctly? If so, prove it. If not, show what can be done
to correct this. Show all your steps and calculations (HINT: you do not have to re-calculate the
weights).
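For checking your hand calculations (an optional aid, not part of the required answer), here is a minimal Python sketch that evaluates the Figure 1 Perceptron on the eight instances, assuming the weights and threshold function given above:

# Evaluate the Figure 1 Perceptron (w0 = 6, w1 = -1, w2 = -3) on all instances.
instances = {
    "P1": (-4, 4), "P2": (1, 4), "P3": (2, 6), "P4": (6, 2),
    "N1": (-3, 1), "N2": (1, -5), "N3": (6, -4), "N4": (4, -2),
}
w0, w1, w2 = 6, -1, -3

def threshold(z):
    # f_threshold(z) = 1 if z >= 0 else 0
    return 1 if z >= 0 else 0

for name, (x, y) in instances.items():
    z = w0 + w1 * x + w2 * y
    print(f"{name}: z = {z}, output = {threshold(z)}")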

Question 1(b)

The following 10 instances are marked as positive (Pi ) and negative (Ni ). A single Perceptron is
used to divide the input space into two classes based on the instances. The resulting decision
boundary/hyperplane is shown in Figure 2.

P1 = (4, 1)
P2 = (2, −1)
P3 = (2, 2)
P4 = (5, 5)
P5 = (3, −2)
N1 = (−3, 4)
N2 = (−1, 1)
N3 = (3, 6)
N4 = (−3, −1)
N5 = (−2, 3)

The cut-off points of the hyperplane are at (0, 1.25) and (-1, 0).

Figure 2: A hyperplane of a Perceptron dividing the input space into positive and negative regions, using the 10 instances shown. The hyperplane passes through the points (0, 1.25) and (-1, 0).

Calculate the values of the weights and bias of the Perceptron, using the given decision surface/hyperplane.
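A general observation that may help (stated here without working the specific answer): a line with y-intercept (0, b) and x-intercept (a, 0) can be written in intercept form, and the boundary determines the Perceptron weights only up to a non-zero scale factor, whose sign fixes which side of the line is classified as positive:

$$\frac{x}{a} + \frac{y}{b} = 1 \quad\Longleftrightarrow\quad w_0 + w_1 x + w_2 y = 0 \quad\text{with}\quad (w_0, w_1, w_2) \propto \Bigl(-1,\ \frac{1}{a},\ \frac{1}{b}\Bigr).$$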

Question 1(c)

Consider again the same 10 instances.

P1 = (4, 1)
P2 = (2, −1)
P3 = (2, 2)
P4 = (5, 5)
P5 = (3, −2)
N1 = (−3, 4)
N2 = (−1, 1)
N3 = (3, 6)
N4 = (−3, −1)
N5 = (−2, 3)

A neural network comprising 2 input neurons (for the x and y values), 2 hidden-layer neurons (h1 and h2), and a single output neuron was trained to classify the data. The hidden and output neurons use the threshold activation function, which outputs either 0 (for a negative instance) or 1 (for a positive instance).

Figure 3: Three-layer neural network with weights and bias values. Hidden neuron h1 has bias w0h1 = 1 and weights w1h1 = 1 (from x) and w2h1 = -1 (from y); hidden neuron h2 has bias w0h2 = -3 and weights w1h2 = 3 (from x) and w2h2 = 1 (from y); output neuron o1 has bias w0o1 = -3 and weights w1o1 = 2 (from h1) and w2o1 = 2 (from h2).

i) Calculate the output for each of the three neurons h1 , h2 , and o1 when N3 is given as input to
the network. Use the weights and bias values as shown in Figure 3.

ii) Does the network classify N3 correctly?

iii) What do you observe about the outputs of h1 and h2 when N3 is given as input? Could the
network still correctly classify N3 ? Explain in detail.

iv) Draw the decision boundaries for each of the three neurons, h1 , h2 , and o1 . Show all your
steps and calculations. Explain what you did to arrive at the diagram for o1 .
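As in Question 1(a), a small Python sketch (an optional aid, assuming the weights shown in Figure 3 and the threshold activation function) can help verify the hand calculations:

# Forward pass through the Figure 3 network with threshold activations.
def threshold(z):
    return 1 if z >= 0 else 0

def forward(x, y):
    h1 = threshold(1 + 1 * x - 1 * y)     # w0h1 = 1,  w1h1 = 1, w2h1 = -1
    h2 = threshold(-3 + 3 * x + 1 * y)    # w0h2 = -3, w1h2 = 3, w2h2 = 1
    o1 = threshold(-3 + 2 * h1 + 2 * h2)  # w0o1 = -3, w1o1 = 2, w2o1 = 2
    return h1, h2, o1

print(forward(3, 6))  # outputs (h1, h2, o1) for N3 = (3, 6)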

Mark out of 100.


40 or less: clear indication that the student does not understand the topic, or evidence of plagiarism.
50: a fair understanding.
60-70: understanding and clear, well-defined examples.
80+: exceptional detail.

Question 2

Design neural networks, using one or more Perceptron neurons, for the data as detailed below.

Let your Perceptron use the threshold activation function:


$$f_{\text{threshold}}(z) = \begin{cases} 1 & z \geq 0 \\ 0 & z < 0 \end{cases}$$

Figure 4: Threshold activation function.

For each sub-question, draw diagrams to show the decision boundaries that your network uses, and
the structure of the neural network showing the weights that will correctly classify the function. Show
the arguments for your choices, all your assumptions, definitions, and calculations. Prove that the
weights you use will correctly classify the function.

Note that this question does not ask you to train a neural network using one of the neural network
algorithms. The purpose here is similar to the question on the relationship between decision trees
and Boolean expressions, in the previous assignment. It is possible to directly map a specific
Boolean expression to a network of Perceptrons. Rojas has a very good explanation of the concepts
needed in Chapter 2 (and some aspects of XOR in Chapter 6).
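As a concrete illustration of this mapping (a standard textbook example, deliberately not one of the data sets below): Boolean AND over x1, x2 in {0, 1} is computed by a single Perceptron with bias w0 = -1.5 and weights w1 = w2 = 1, since

$$f_{\text{threshold}}(-1.5 + x_1 + x_2) = 1 \iff x_1 + x_2 \geq 1.5 \iff x_1 = x_2 = 1.$$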

Question 2(a)

Data set 1:

P1 = (-3, 3)
P2 = (3, -3)
P3 = (3, 3)
N1 = (-3, -3)


Question 2(b)

Data set 2:

P1 = (−3, −3)
P2 = (3, 3)
N1 = (−3, 3)
N2 = (3, −3)

Mark out of 100.


40 or less: clear indication that the student does not understand the topic, or evidence of plagiarism.
50: a fair understanding.
60-70: understanding and clear, well-defined examples.
80+: exceptional detail.

Question 3

Question 3(a)

Find the original 1986 article by Rumelhart, Hinton, and Williams that introduced the BACKPROPAGATION algorithm, and read it. Give the URL where you found it.

Question 3(b)

Find the following sites and study them:

https://brilliant.org/wiki/artificial-neural-network/
https://brilliant.org/wiki/feedforward-neural-networks/
https://brilliant.org/wiki/backpropagation/
https://www.youtube.com/watch?v=

The following Python code implements a basic Backpropagation algorithm.
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

# activation functions and their derivatives
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    return x * (1 - x)

def relu(x):
    # modify this code
    return x

def relu_derivative(x):
    # modify this code
    return x

def leaky_relu(x, alpha):
    # modify this code
    return x

def leaky_relu_derivative(x, alpha):
    # modify this code
    return x

# activation function selection
activation_function = 'sigmoid'
# activation_function = 'relu'
# activation_function = 'leaky_relu'

# applying the chosen activation function
def activate(x, func=activation_function):
    if func == 'relu':
        return relu(x)
    elif func == 'leaky_relu':
        return leaky_relu(x, alpha)
    else:  # default to sigmoid if not specified
        return sigmoid(x)

def activate_derivative(x, func=activation_function):
    if func == 'relu':
        return relu_derivative(x)
    elif func == 'leaky_relu':
        return leaky_relu_derivative(x, alpha)
    else:  # default to sigmoid if not specified
        return sigmoid_derivative(x)

# training data
X = np.array([
    [0, 0, 1],
    [0, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 1],
    [1, 1, 0]
])
y = np.array([
    [1],
    [1],
    [0],
    [0],
    [1],
    [0]
])

# initialise weights
np.random.seed(6)
input_size = X.shape[1]
hidden_size = 4
output_size = 1
weights_input_hidden = 2 * np.random.random((input_size, hidden_size)) - 1
weights_hidden_output = 2 * np.random.random((hidden_size, output_size)) - 1

# training
learning_rate = 0.1
alpha = 0.01
epochs = 10000
loss_values = []

for epoch in range(epochs):
    # forward pass
    hidden_layer_input = np.dot(X, weights_input_hidden)
    hidden_layer_output = activate(hidden_layer_input, activation_function)
    final_output = activate(np.dot(hidden_layer_output, weights_hidden_output), activation_function)

    # compute error
    error = y - final_output
    loss = np.mean(np.abs(error))
    loss_values.append(loss)

    # backward pass
    d_predicted_output = error * activate_derivative(final_output, activation_function)
    error_hidden_layer = d_predicted_output.dot(weights_hidden_output.T)
    d_hidden_layer = error_hidden_layer * activate_derivative(hidden_layer_output, activation_function)

    # updating weights
    weights_hidden_output += hidden_layer_output.T.dot(d_predicted_output) * learning_rate
    weights_input_hidden += X.T.dot(d_hidden_layer) * learning_rate

    if epoch % 1000 == 0:
        loss = np.mean(np.abs(error))
        print(f'Epoch {epoch}, Loss: {loss}')

# final output after training
print(f"Activation function: {activation_function}")
print("Final outputs after training:")
print(final_output)

# plotting
plt.figure(figsize=(10, 2))
plt.plot(loss_values)
y_max = 0.75
x_max = epochs
plt.xticks(np.arange(0, x_max + 1, 1000))
plt.yticks(np.arange(0, y_max, 0.1))
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.ylim(0, y_max)
plt.xlim(0, epochs)
plt.grid(True)
plt.show()

Consider the data in Table 1. This same data is used in the Python code.

Copy and execute the code in a Python 3 interpreter or a Jupyter notebook. Copy the output of your
program into your answer. Explain what the output says about the training of the network. Explain
the output values in terms of the training data in Table 1.

12
COS4852/A2

Question 3(c)

Consider function f5 :

x1  x2  x3  f5(x1, x2, x3)
0   0   1   1
0   1   0   1
0   1   1   0
1   0   0   0
1   0   1   1
1   1   0   0

Table 1: f5(x1, x2, x3)

Modify the code to set the random seed (HINT: using np.random.seed(11) in the right place in
the code sets the seed to 11, for example). Use the data from function f5 , and perform the following
experiments:

(1) Run the code with f5 using your new random seed.

(2) Modify the code to use only 2 hidden-layer neurons with f5 . Try different random seeds.

(3) Modify the code to use only 1 hidden-layer neuron with f5 . Try different random seeds.

(4) Add code to implement the RELU and LEAKY-RELU functions and their derivatives (a possible implementation is sketched after this list). Keep in mind that LEAKY-RELU uses an α (alpha) parameter that you need to include in the code. Set α = 0.01. Run your code with 4 hidden neurons and use f5 for training. Again, try different random seeds.

(5) Repeat the last experiment with LEAKY-RELU, using various α values.
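One common way to fill in the stub functions (a sketch only, assuming x is a NumPy array; note that the given training loop passes the layer output, rather than the pre-activation, into the derivative functions, which gives the same result here because RELU and LEAKY-RELU preserve the sign of their input):

def relu(x):
    # element-wise max(0, x)
    return np.maximum(0, x)

def relu_derivative(x):
    # gradient is 1 where x > 0, else 0
    return np.where(x > 0, 1.0, 0.0)

def leaky_relu(x, alpha):
    # x where x > 0, else alpha * x
    return np.where(x > 0, x, alpha * x)

def leaky_relu_derivative(x, alpha):
    # gradient is 1 where x > 0, else alpha
    return np.where(x > 0, 1.0, alpha)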

Show listings of your modifications to the code. Show the outputs from the code. Discuss the results
of these experiments, in terms of:

(i) Whether the training was successful or not.

(ii) How long it takes to train successfully.

(iii) What happens to the Loss rate.

(iv) What the output results mean.

(v) Explain why some were successful and some not.

(vi) In the experiments where you adjust the α value, explain why the training fails at some point.

Here is a link that explains the most common activation functions used in Backpropagation:
https://www.baeldung.com/cs/ml-nonlinear-activation-functions

Question 3(d)

Discuss the different activation functions that can be used in feedforward neural networks, how they
work, why they allow training to happen in various algorithms, and their pros and cons. Provide the
links of sources you used.
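For reference when writing your discussion, the standard definitions of the most common activation functions (as given in the sources above) are:

$$\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad \tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}},$$

$$\text{ReLU}(z) = \max(0, z), \qquad \text{LeakyReLU}(z) = \begin{cases} z & z > 0 \\ \alpha z & z \leq 0 \end{cases}$$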

Mark out of 100.


40 or less: clear indication that the student does not understand the topic, or evidence of plagiarism.
50: a fair understanding.
60-70: understanding and clear, well-defined examples.
80+: exceptional detail.


Question 4

Question 4(a)

Find and read material on Autoencoders in neural networks. Here are some sources to get you
started:
https://www.ibm.com/topics/autoencoder
https://www.datacamp.com/tutorial/introduction-to-autoencoders
https://www.tensorflow.org/tutorials/generative/autoencoder
https://en.wikipedia.org/wiki/Autoencoder

Copy the following code into a Python 3 interpreter or a Jupyter notebook and execute it. You may need to do this a few times, as the initial weights are random. From what you've learned about autoencoders, think carefully about when the training will be successful.
# imports
import torch
import torch.nn as nn
import torch.optim as optim

# define autoencoder model
class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        self.encoder = nn.Linear(8, 3)
        self.decoder = nn.Linear(3, 8)

    def forward(self, x):
        x = torch.sigmoid(self.encoder(x))
        x = torch.sigmoid(self.decoder(x))
        return x

# output tensor in a more readable format
def print_readable_tensor(tensor):
    readable_tensor = torch.round(tensor * 1000) / 1000
    print(readable_tensor)

# data
data = torch.tensor([
    [0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0]
], dtype=torch.float32)

# initialise the model, loss function, and optimiser
model = Autoencoder()
criterion = nn.MSELoss()
optimiser = optim.Adam(model.parameters(), lr=0.001)

# training loop
epochs = 20000
for epoch in range(epochs):
    output = model(data)
    loss = criterion(output, data)

    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

    if (epoch + 1) % 1000 == 0:
        print(f'Epoch [{epoch + 1}/{epochs}], Loss: {loss.item():.4f}')

# test the model
with torch.no_grad():
    encoded = torch.sigmoid(model.encoder(data))
    decoded = model(data)
    print('Encoded Representation:')
    print_readable_tensor(encoded)
    print('Decoded Output:')
    print_readable_tensor(decoded)
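Since each run depends on the random initial weights, it may help (an optional suggestion, not part of the given code) to fix PyTorch's random seed before constructing the model, so that an interesting run can be reproduced:

torch.manual_seed(42)  # any fixed integer makes the initialisation reproducible
model = Autoencoder()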

Show the output from the program. What does the progress of the Loss values tell us about what is happening during training? From what you've learned about autoencoders, explain in detail what the two tensors tell us about how the network encodes and decodes the dataset.

Mark out of 100.


40 or less: clear indication that the student does not understand the topic, or evidence of plagiarism.
50: a fair understanding.
60-70: understanding and clear, well-defined examples.
80+: exceptional detail.

© Unisa 2024
