D.Y. Patil College of Engineering, Akurdi Department of Electronics & Telecommunication Engineering
Experiment No:- 01
Theory:
The structure of artificial neural networks is based on the present understanding of
biological neural systems. The structure of a biological neuron is shown in figure (1). The
computation is achieved by dense interconnection of simple processing units. To describe these
computing attributes, artificial neural networks go by many names, such as
connectionist models, parallel distributed processors, or self-organizing systems. With such
features, an artificial neural system has great potential in performing applications such as
speech and image recognition where intense computation can be done in parallel and the
computational elements are connected by weighted links.
The artificial neuron, the most fundamental computational unit, is modeled based on the basic
property of a biological neuron. This type of processing unit performs in two stages:
weighted summation and some type of nonlinear function. It accepts a set of inputs to
generate the weighted sum, then passes the result to the nonlinear function to make an output.
Unlike conventional computing systems, which have fixed instructions to perform specific
computations, the artificial neural network needs to be taught and trained to function
correctly. The advantage is that the neural system can learn new input-output patterns and
adjust the system parameters. Such learning can eliminate specifying instructions to be
executed for computations. Instead, users simply supply appropriate sample input-output
patterns to the network.
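The two-stage processing unit described above can be sketched in a few lines of plain Python. This is only an illustration: the sigmoid is one possible choice of nonlinear function, and the input and weight values are hypothetical.

```python
import math

def neuron(inputs, weights, bias=0.0):
    """Two-stage processing unit: weighted summation, then a nonlinear function."""
    # Stage 1: weighted sum of the inputs
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Stage 2: nonlinear function (sigmoid is an assumption, not prescribed by the text)
    return 1.0 / (1.0 + math.exp(-net))

# Hypothetical inputs and weights: net = 0.4*1.0 + (-0.8)*0.5 = 0.0
output = neuron([1.0, 0.5], [0.4, -0.8])
print(output)
```

Training would adjust `weights` and `bias` from sample input-output patterns; here they are fixed by hand.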
The following table lists the corresponding terminology of biological and artificial neural networks:
Biological Neural Network    Artificial Neural Network
Cell Body                    Neurons
Dendrite                     Weights or interconnections
Soma                         Net input
Axon                         Output
The earliest model of an artificial neuron was introduced by Warren McCulloch and Walter Pitts
in 1943. The McCulloch-Pitts neural model is also known as the linear threshold gate. It is a
neuron with a set of inputs (I1, I2, I3, ..., Im) and one output y. The linear threshold gate
simply classifies the set of inputs into two different classes. Thus the output y is binary. Such
a function can be described mathematically using these equations:
$sum = \sum_{i=1}^{m} I_i W_i$ and $y = f(sum)$
W1,W2,W3…Wm are weight values normalized in the range of either (0,1) or (-1,1) and
associated with each input line, sum is the weighted sum, and T is a threshold constant. The
function f is a linear step function at threshold T as shown in figure (2). The symbolic
representation of the linear threshold gate is shown in figure (3).
Figure 3: Symbolic Illustration of Linear Threshold Gate
This model is so simplistic that it only generates a binary output and also the weight and
threshold values are fixed.
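As a minimal sketch of the linear threshold gate, the function below compares the weighted sum against a threshold T. The convention "output 1 when the sum reaches T" and the AND weights are assumptions chosen for illustration. The second function shows the equivalent bias form, where the threshold is moved into an extra weight on a constant input of 1.

```python
def threshold_gate(inputs, weights, T):
    """Linear threshold gate: 1 if the weighted sum reaches T, else 0 (assumed convention)."""
    total = sum(w * i for w, i in zip(weights, inputs))
    return 1 if total >= T else 0

def threshold_gate_bias_form(inputs, weights, T):
    """Same gate with the threshold folded in as weight -T on a constant input 1."""
    total = sum(w * i for w, i in zip([-T] + list(weights), [1] + list(inputs)))
    return 1 if total >= 0 else 0

# AND gate with hypothetical fixed weights (1, 1) and threshold 1.5
pairs = [(0, 0), (1, 0), (0, 1), (1, 1)]
and_outputs = [threshold_gate(p, [1, 1], T=1.5) for p in pairs]
bias_outputs = [threshold_gate_bias_form(p, [1, 1], T=1.5) for p in pairs]
print(and_outputs, bias_outputs)
```

Both forms give the same truth table, which is why a program can use a bias weight in place of an explicit threshold.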
Conclusion:-
Program:
import numpy as np

def AND(x1, x2):
    # x[0] = 1 is a constant bias input; w[0] plays the role of the negated threshold
    x = np.array([1, x1, x2])
    w = np.array([-1.5, 1, 1])
    y = np.sum(w*x)
    if y <= 0:
        return 0
    else:
        return 1

def OR(x1, x2):
    x = np.array([1, x1, x2])
    w = np.array([-0.5, 1, 1])
    y = np.sum(w*x)
    if y <= 0:
        return 0
    else:
        return 1

def NAND(x1, x2):
    x = np.array([1, x1, x2])
    w = np.array([1.5, -1, -1])
    y = np.sum(w*x)
    if y <= 0:
        return 0
    else:
        return 1

if __name__ == '__main__':
    # all four input combinations of a two-input gate
    inputs = [(0, 0), (1, 0), (0, 1), (1, 1)]
    print("AND")
    for x in inputs:
        y = AND(x[0], x[1])
        print(str(x) + " -> " + str(y))
    print("OR")
    for x in inputs:
        y = OR(x[0], x[1])
        print(str(x) + " -> " + str(y))
    print("NAND")
    for x in inputs:
        y = NAND(x[0], x[1])
        print(str(x) + " -> " + str(y))
Output:
AND
(0, 0) -> 0
(1, 0) -> 0
(0, 1) -> 0
(1, 1) -> 1
OR
(0, 0) -> 0
(1, 0) -> 1
(0, 1) -> 1
(1, 1) -> 1
NAND
(0, 0) -> 1
(1, 0) -> 1
(0, 1) -> 1
(1, 1) -> 0
Experiment No:- 02
Theory:
Conclusion:-
## Experiment no 2 ##
## implement simple linear regression model
import pandas as pd                                 # data frames
import numpy as np                                  # numeric functions
import matplotlib.pyplot as plt                     # plotting the graph
from sklearn.linear_model import LinearRegression   # linear regression model
mydata = pd.read_csv("linreg.csv")                  # import CSV file
print(mydata)                                       # print data from CSV file
Output:
runfile('E:/ML/exp2/linea.py', wdir='E:/ML/exp2')
area price
0 500 32
1 570 36
2 590 38
3 682 41
4 729 46
5 680 50
6 825 53
7 900 58
8 850 61
9 1000 66
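The program above stops after printing the data. As a hedged sketch of the fitting step, the block below computes the ordinary least-squares line for the area/price values shown in the output, using the closed-form formulas in plain Python rather than scikit-learn. The function name `fit_line` and the query area of 750 are hypothetical.

```python
# Data copied from the printed CSV contents above
area = [500, 570, 590, 682, 729, 680, 825, 900, 850, 1000]
price = [32, 36, 38, 41, 46, 50, 53, 58, 61, 66]

def fit_line(x, y):
    """Closed-form simple linear regression: returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # covariance-like and variance-like sums around the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(area, price)
predicted = slope * 750 + intercept  # price estimate for a hypothetical area of 750
print(slope, intercept, predicted)
```

An ordinary least-squares fit with `LinearRegression` on the same single feature would be expected to give the same line, since it minimizes the same squared error.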
Experiment No:- 03
Theory:
A multilayer perceptron (MLP) is a feedforward artificial neural network model that maps
sets of input data onto a set of appropriate outputs. An MLP consists of multiple layers of
nodes in a directed graph, with each layer fully connected to the next one. Except for the
input nodes, each node is a neuron (or processing element) with a nonlinear activation
function. MLP utilizes a supervised learning technique called back-propagation for training
the network. MLP is a modification of the standard linear perceptron and can distinguish data
that are not linearly separable.
The architecture of back propagation network is shown below:
Modes of learning:
There are two modes of learning to choose from: stochastic and batch.
In stochastic learning, each propagation is followed immediately by a weight update. In batch
learning many propagations occur before updating the weights, accumulating errors over the
samples within a batch.
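The difference between the two modes can be sketched on a one-weight model y = w*x. This is an illustrative toy, not the network used in this experiment: the data, learning rate, and epoch count are assumptions, and the stochastic version visits the samples in order rather than at random to keep the example deterministic.

```python
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # generated by y = 2x, so the ideal weight is 2

def train_stochastic(xs, ys, lr=0.05, epochs=200):
    w = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            error = y - w * x
            w += lr * error * x          # weight update after every sample
    return w

def train_batch(xs, ys, lr=0.05, epochs=200):
    w = 0.0
    for _ in range(epochs):
        accumulated = 0.0
        for x, y in zip(xs, ys):
            accumulated += (y - w * x) * x   # accumulate over the whole batch
        w += lr * accumulated / len(xs)      # single update per batch
    return w

w_stochastic = train_stochastic(xs, ys)
w_batch = train_batch(xs, ys)
print(w_stochastic, w_batch)
```

Both runs approach w = 2; they differ only in how often the weight is updated per pass over the data.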
Limitations
Gradient descent with backpropagation is not guaranteed to find the global minimum
of the error function, but only a local minimum; also, it has trouble crossing plateaux
in the error function landscape.
Backpropagation learning does not require normalization of input vectors; however,
normalization could improve performance.
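The local-minimum limitation can be illustrated with gradient descent on a simple one-dimensional function that has two minima. The function below is a hypothetical stand-in for an error surface, not the network's actual error function; the learning rate and step count are also assumptions.

```python
def f(x):
    # toy "error function" with a global minimum near x = -1.3
    # and a shallower local minimum near x = 1.13
    return x**4 - 3 * x**2 + x

def grad(x):
    return 4 * x**3 - 6 * x + 1

def descend(x, lr=0.01, steps=2000):
    for _ in range(steps):
        x -= lr * grad(x)   # move against the gradient
    return x

left = descend(-2.0)    # starts on the left slope
right = descend(2.0)    # starts on the right slope
print(left, f(left))
print(right, f(right))
```

The run started at x = 2 settles in the shallower minimum on the right and never reaches the lower one on the left, which shows how the result depends on the starting point.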
Conclusion:-
Exp no 3
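import numpy as np

def sigmoid(x):
    return 1.0/(1.0 + np.exp(-x))

def sigmoid_prime(a):
    # derivative written in terms of the activation output a = sigmoid(x),
    # because fit() passes already-activated values
    return a*(1.0 - a)

def tanh(x):
    return np.tanh(x)

def tanh_prime(a):
    # derivative written in terms of the activation output a = tanh(x)
    return 1.0 - a**2

class NeuralNetwork:
    def __init__(self, layers, activation='tanh'):
        # 'tanh' as the default activation is an assumption
        if activation == 'sigmoid':
            self.activation = sigmoid
            self.activation_prime = sigmoid_prime
        else:
            self.activation = tanh
            self.activation_prime = tanh_prime
        # Set weights
        self.weights = []
        # layers = [2,2,1]
        # range of weight values (-1,1)
        # input and hidden layers - random((2+1, 2+1)) : 3 x 3
        for i in range(1, len(layers) - 1):
            r = 2*np.random.random((layers[i-1] + 1, layers[i] + 1)) - 1
            self.weights.append(r)
        # output layer - random((2+1, 1)) : 3 x 1
        r = 2*np.random.random((layers[i] + 1, layers[i+1])) - 1
        self.weights.append(r)

    def fit(self, X, y, learning_rate=0.2, epochs=100000):
        # learning_rate=0.2 is an assumed default; epochs=100000 matches the output below
        # add a column of ones to X as the bias input
        ones = np.atleast_2d(np.ones(X.shape[0]))
        X = np.concatenate((ones.T, X), axis=1)
        for k in range(epochs):
            if k % 10000 == 0:
                print('epochs:', k)
            # stochastic learning: one randomly chosen sample per update
            i = np.random.randint(X.shape[0])
            a = [X[i]]
            # forward pass through all layers
            for l in range(len(self.weights)):
                dot_value = np.dot(a[l], self.weights[l])
                activation = self.activation(dot_value)
                a.append(activation)
            # output layer
            error = y[i] - a[-1]
            deltas = [error * self.activation_prime(a[-1])]
            # hidden layers, starting from the second to last layer
            for l in range(len(a) - 2, 0, -1):
                deltas.append(deltas[-1].dot(self.weights[l].T) * self.activation_prime(a[l]))
            # reverse
            # [level3(output)->level2(hidden)] => [level2(hidden)->level3(output)]
            deltas.reverse()
            # backpropagation
            # 1. Multiply its output delta and input activation
            #    to get the gradient of the weight.
            # 2. Move the weight by a fraction (learning_rate) of that gradient.
            for i in range(len(self.weights)):
                layer = np.atleast_2d(a[i])
                delta = np.atleast_2d(deltas[i])
                self.weights[i] += learning_rate * layer.T.dot(delta)

    def predict(self, x):
        # prepend the bias input, then run a forward pass
        a = np.concatenate((np.ones(1), np.array(x)))
        for l in range(len(self.weights)):
            a = self.activation(np.dot(a, self.weights[l]))
        return a

if __name__ == '__main__':
    nn = NeuralNetwork([2,2,1])
    # XOR truth table
    X = np.array([[0, 0],
                  [0, 1],
                  [1, 0],
                  [1, 1]])
    y = np.array([0, 1, 1, 0])
    nn.fit(X, y)
    for e in X:
        print(e, nn.predict(e))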
Output:
epochs: 0
epochs: 10000
epochs: 20000
epochs: 30000
epochs: 40000
epochs: 50000
epochs: 60000
epochs: 70000
epochs: 80000
epochs: 90000
(array([0, 0]), array([ 9.14891326e-05]))
(array([0, 1]), array([ 0.99557796]))
(array([1, 0]), array([ 0.99707463]))
(array([1, 1]), array([ 0.00090973]))
Experiment No:- 04
Theory:
A radial basis function network is an artificial neural network that uses radial basis
functions as activation functions. The output of the network is a linear combination of radial
basis functions of the inputs and neuron parameters. Radial basis function networks have
many uses, including function approximation, time series prediction, classification, and
system control. They were first formulated in a 1988 paper by Broomhead and Lowe, both
researchers at the Royal Signals and Radar Establishment.
An RBF network is used for approximating functions and recognizing patterns. It uses Gaussian
potential functions. Powell used radial basis functions in exact interpolation. In
interpolation we have n data points $x_i \in R^d$ and n real-valued numbers $t_i \in R$, where i = 1, ..., n.
The task is to determine a function S in a linear space such that $S(x_i) = t_i$, i = 1, ..., n. The
interpolation function is a linear combination of basis functions
$V_i(x) = \Phi(\|x - x_i\|)$
where Φ is a mapping $R^+ \to R$ and the norm is the Euclidean distance. The network uses the most
common nonlinearities, such as sigmoidal and Gaussian kernel functions. Gaussian
functions are also used in regularization networks. The response of such a function is positive
for all values of y; the response decreases to 0 as |y| → ∞. The Gaussian function is generally
defined as
$f(y) = e^{-y^2}$
The derivative of this function is given by
$f'(y) = -2y e^{-y^2} = -2y f(y)$
When Gaussian potential functions are used, each node produces an
identical output for all inputs at the same radial distance from the center of the
kernel; the functions are radially symmetric, hence the name radial basis function
network. The entire network forms a linear combination of the nonlinear basis functions.
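The Gaussian function, its derivative, and the radial symmetry property can be checked numerically with a short sketch. The function names and the sample points are hypothetical choices for illustration.

```python
import math

def gaussian(y):
    """Gaussian potential function f(y) = exp(-y^2)."""
    return math.exp(-y ** 2)

def gaussian_prime(y):
    """Analytical derivative f'(y) = -2*y*f(y)."""
    return -2.0 * y * gaussian(y)

def rbf(x, center):
    """Basis function value; depends only on the Euclidean distance to the center."""
    r = math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, center)))
    return gaussian(r)

# Central-difference check of the derivative at a sample point
h = 1e-6
numeric = (gaussian(0.5 + h) - gaussian(0.5 - h)) / (2 * h)
analytic = gaussian_prime(0.5)

# Two different inputs at the same distance from the center give the same output
same_radius_a = rbf((1.0, 0.0), (0.0, 0.0))
same_radius_b = rbf((0.0, 1.0), (0.0, 0.0))
print(numeric, analytic, same_radius_a, same_radius_b)
```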
Conclusion:-
Exp 04
y_pred = rbfnet.predict(X)
plt.plot(X, y, '-o', label='true')
plt.plot(X, y_pred, '-o', label='RBF-Net')
plt.legend()
plt.tight_layout()
plt.show()
Experiment No:- 05
Aim: Implement Self Organizing Feature Map (SOFM) for character recognition.
Theory:
The K-SOFM architecture is shown in figure:
The training utilizes competitive learning. When a training example is fed to the network, its
Euclidean distance to all weight vectors is computed. The neuron whose weight vector is
most similar to the input is called the best matching unit (BMU). The weights of the BMU
and neurons close to it in the SOM lattice are adjusted towards the input vector. The
magnitude of the change decreases with time and with distance (within the lattice) from the
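BMU. The update formula for a neuron v with weight vector $W_v(s)$ is
$W_v(s+1) = W_v(s) + \theta(u, v, s) \, \alpha(s) \, (D(t) - W_v(s))$
where s is the step index, D(t) is the input vector, u is the index of the BMU for D(t), α(s) is a
learning coefficient that decreases with time, and θ(u, v, s) is the neighborhood function, which
depends on the lattice distance between neuron u and neuron v.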
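A single competitive-learning step of the kind described above can be sketched as follows. The one-dimensional lattice, the Gaussian neighborhood function, and all numeric values are assumptions made to keep the example small; a full implementation would also shrink the learning rate and radius over time.

```python
import math

def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest (Euclidean) to x."""
    dists = [math.sqrt(sum((wi - xi) ** 2 for wi, xi in zip(w, x))) for w in weights]
    return dists.index(min(dists))

def som_update(weights, x, learning_rate, radius):
    """One training step on a 1-D lattice; neurons are indexed by lattice position."""
    bmu = best_matching_unit(weights, x)
    for v, w in enumerate(weights):
        lattice_dist = abs(v - bmu)
        # Gaussian neighborhood (assumed): 1 at the BMU, smaller farther away
        theta = math.exp(-(lattice_dist ** 2) / (2 * radius ** 2))
        weights[v] = [wi + theta * learning_rate * (xi - wi) for wi, xi in zip(w, x)]
    return bmu

# Hypothetical lattice of three neurons with 2-D weight vectors
weights = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
bmu = som_update(weights, [0.9, 0.8], learning_rate=0.5, radius=1.0)
print(bmu, weights)
```

The BMU moves halfway toward the input, while its lattice neighbors move by a smaller fraction.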
Conclusion:-
Experiment No:- 06
Aim: Implement an SVM classifier for classification of data into two classes. Students can use a
dataset such as flower classification.
Theory:
Support vectors are simply the coordinates of individual observations. A Support Vector
Machine is a frontier which best segregates the two classes (hyper-plane/line).
How does it work?
Above, we got accustomed to the process of segregating the two classes with a hyper-plane.
Now the burning question is “How can we identify the right hyper-plane?”. Don’t worry, it’s
not as hard as you think!
Let’s understand:
Identify the right hyper-plane (Scenario-1): Here, we have three hyper-planes (A, B
and C). Now, identify the right hyper-plane to classify star and circle.
You need to remember a thumb rule to identify the right hyper-plane: “Select the hyper-plane which
segregates the two classes better”. In this scenario, hyper-plane “B” has excellently
performed this job.
Identify the right hyper-plane (Scenario-2): Here, we have three hyper-planes (A, B
and C) and all are segregating the classes well. Now, how can we identify the right
hyper-plane? The margin is the distance between the hyper-plane and the nearest data point of
either class.
Above, you can see that the margin for hyper-plane C is high as compared to both A and B.
Hence, we name the right hyper-plane as C. Another compelling reason for selecting the
hyper-plane with the higher margin is robustness. If we select a hyper-plane having a low margin, then
there is a high chance of misclassification.
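The margin comparison can be sketched numerically. The data points and the three candidate hyper-planes below are hypothetical values, chosen so that all candidates separate the classes and C has the widest margin, as in the scenario described.

```python
import math

def distance_to_hyperplane(point, w, b):
    """Distance from a point to the hyper-plane w.x + b = 0."""
    value = sum(wi * xi for wi, xi in zip(w, point)) + b
    return abs(value) / math.sqrt(sum(wi ** 2 for wi in w))

def margin(points, w, b):
    """Margin = distance from the hyper-plane to the nearest data point."""
    return min(distance_to_hyperplane(p, w, b) for p in points)

# Hypothetical data: two "stars" and two "circles"
points = [(1.0, 1.0), (2.0, 1.0), (4.0, 4.0), (5.0, 4.0)]

# Hypothetical candidates as (w, b): A is x = 2.5, B is y = 2.5, C is x + y = 5.5
candidates = {
    "A": ((1.0, 0.0), -2.5),
    "B": ((0.0, 1.0), -2.5),
    "C": ((1.0, 1.0), -5.5),
}

margins = {name: margin(points, w, b) for name, (w, b) in candidates.items()}
best = max(margins, key=margins.get)
print(margins, best)
```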
Identify the right hyper-plane (Scenario-3): Hint: Use the rules discussed in the
previous section to identify the right hyper-plane.
One star at the other end is like an outlier for the star class. SVM has a feature to
ignore outliers and find the hyper-plane that has maximum margin. Hence, we can
say that SVM is robust to outliers.
Find the hyper-plane to segregate two classes (Scenario-5): In the scenario below, we
can’t have a linear hyper-plane between the two classes, so how does SVM classify
these two classes? Till now, we have only looked at the linear hyper-plane.
SVM handles this case with kernel functions. Simply put, a kernel does some extremely complex
data transformations, then finds out the process to separate the data based on the labels or
outputs you’ve defined. When we look at
the hyper-plane in the original input space, it looks like a circle:
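One way to see how a data transformation can make such classes separable is to add a new feature z = x^2 + y^2. In the sketch below the points are hypothetical, and the explicit feature and threshold are assumptions for illustration; an SVM with a nonlinear kernel does not require this feature to be built by hand.

```python
# Hypothetical data: one class near the origin, the other class around it
inner = [(0.5, 0.0), (0.0, -0.5), (-0.3, 0.4)]
outer = [(2.0, 0.0), (0.0, -2.0), (-1.5, 1.5)]

def lift(point):
    """New feature z = x^2 + y^2 (squared distance from the origin)."""
    x, y = point
    return x * x + y * y

# In the z feature a single threshold separates the classes.
# z = 1 corresponds to the unit circle in the original input space.
threshold = 1.0

def classify(point):
    return 1 if lift(point) > threshold else 0

inner_labels = [classify(p) for p in inner]
outer_labels = [classify(p) for p in outer]
print(inner_labels, outer_labels)
```

A straight cut in the z feature therefore appears as a circular boundary when drawn in the original x-y space.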
Now, let’s look at the methods to apply SVM algorithm in a data science challenge.
Conclusion:-
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
X = np.array([
[0, 0],
[0, 1],
[1, 0],
[1, 1]
])
y = np.array([0, 0, 0, 1])
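clf = svm.SVC(kernel='linear', C=1e6)  # a very large C leaves almost no tolerance for margin violations
clf.fit(X, y)
plt.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap=plt.cm.Paired)
plt.show()  # display the scatter plot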
Output:
Experiment No:- 07
Theory:
Module overview
This article describes how to use the Two-Class Support Vector Machine module in Azure
Machine Learning Studio, to create a model that is based on the support vector machine
algorithm.
Support vector machines (SVMs) are a well-researched class of supervised learning methods.
This particular implementation is suited to prediction of two possible outcomes, based on
either continuous or categorical variables.
After defining the model parameters, train the model by using one of the training modules,
and providing a tagged dataset that includes a label or outcome column.
More about support vector machines
Support vector machines are among the earliest of machine learning algorithms, and SVM
models have been used in many applications, from information retrieval to text and image
classification. SVMs can be used for both classification and regression tasks.
This SVM model is a supervised learning model that requires labeled data. In the training
process, the algorithm analyzes input data and recognizes patterns in a multi-dimensional
feature space. All input examples are represented as points in this
space, and are mapped to output categories in such a way that the categories are divided by a
hyperplane with as wide and clear a gap as possible.
For prediction, the SVM algorithm assigns new examples into one category or the other,
mapping them into that same space.
How to configure Two-Class Support Vector Machine
For this model type, it is recommended that you normalize the dataset before using it to train
the classifier.
Single Parameter: If you know how you want to configure the model, you
can provide a specific set of values as arguments.
Parameter Range: If you are not sure of the best parameters, you can find the
optimal parameters by specifying multiple values and using the Tune Model
Hyperparameters module to find the optimal configuration. The trainer iterates
over multiple combinations of the settings and determines the combination of
values that produces the best model.
2. For Number of iterations, type a number that denotes the number of iterations used
when building the model.
This parameter can be used to control trade-off between training speed and accuracy.
This regularization coefficient can be used to tune the model. Larger values penalize
more complex models.
If you apply normalization, before training, data points are centered at the mean and
scaled to have one unit of standard deviation.
Projecting values to unit space means that before training, data points are centered at 0
and scaled to have one unit of standard deviation.
6. In Random number seed, type an integer value to use as a seed if you want to ensure
reproducibility across runs. Otherwise, a system clock value is used as a seed, which
can result in slightly different results across runs.
7. Select the option, Allow unknown category, to create a group for unknown values in
the training or validation sets. In this case, the model might be less precise for known
values, but it can provide better predictions for new (unknown) values.
If you deselect it, the model can accept only the values that are contained in the training
data.
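The normalization mentioned in the steps above (centering at the mean and scaling to one unit of standard deviation) can be sketched in plain Python. This illustrates the arithmetic only, not the Azure module itself; the use of the population standard deviation and the sample column are assumptions.

```python
import statistics

def standardize(values):
    """Center values at the mean and scale them to unit standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumed)
    return [(v - mean) / std for v in values]

# Hypothetical feature column
column = [2.0, 4.0, 6.0, 8.0]
scaled = standardize(column)
print(scaled)
```

After scaling, features measured on very different ranges contribute on a comparable scale to the distance computations used by the SVM.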
Note
If you pass a parameter range to Train Model, it will use only the first value in the
parameter range list.
If you pass a single set of parameter values to the Tune Model
Hyperparameters module, when it expects a range of settings for each parameter, it
ignores the values and uses the default values for the learner.
If you select the Parameter Range option and enter a single value for any parameter,
that single value you specified will be used throughout the sweep, even if other
parameters change across a range of values.
Results
To see a summary of the model's parameters, together with the feature weights
learned from training, right-click the output of Train Model or Tune Model
Hyperparameters, and select Visualize.
To use the trained models to make predictions, connect the trained model to the Score
Model module.
To perform cross-validation against a labeled data set, connect the untrained model
and the dataset to Cross-Validate Model.
Examples
For examples of how this learning algorithm is used, see the Azure AI Gallery:
Technical notes
This section contains implementation details, tips, and answers to frequently asked questions.
Usage tips
Although recent research has developed algorithms that have higher accuracy, this algorithm
can work well on simple data sets when your goal is speed over accuracy. If you do not get
the desired results by using Two-Class Support Vector Model, try one of these
classification methods:
Conclusion:-
Exp no 07:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import classification_report, confusion_matrix

bankdata = pd.read_csv("D:/Datasets/bill_authentication.csv")
bankdata.shape   # displayed only in an interactive session
bankdata.head()  # displayed only in an interactive session
# separate the features from the 'Class' label column
X = bankdata.drop('Class', axis=1)
y = bankdata['Class']
# hold out 20% of the data for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)
svclassifier = SVC(kernel='linear')
svclassifier.fit(X_train, y_train)
# predict the test-set labels before scoring them
y_pred = svclassifier.predict(X_test)
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
output:
[[152 0] [ 1 122]]
precision recall f1-score support
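The confusion matrix printed above can be turned into the usual metrics by hand. The sketch below assumes the scikit-learn layout (rows are true classes, columns are predicted classes) and treats class 1 as the positive class, which is a choice made here for illustration.

```python
# Confusion matrix copied from the output above
cm = [[152, 0],
      [1, 122]]

tn, fp = cm[0]   # true class 0: correctly predicted 0, wrongly predicted 1
fn, tp = cm[1]   # true class 1: wrongly predicted 0, correctly predicted 1

accuracy = (tp + tn) / (tp + tn + fp + fn)   # 274 / 275
precision = tp / (tp + fp)                   # 122 / 122
recall = tp / (tp + fn)                      # 122 / 123
f1 = 2 * precision * recall / (precision + recall)
print(accuracy, precision, recall, f1)
```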
Experiment No:- 08
Title: Study, implement, and test a CNN for object recognition
Aim: Implement and test a Convolutional Neural Network for object recognition.
Artificial Intelligence has been witnessing a monumental growth in bridging the gap
between the capabilities of humans and machines. Researchers and enthusiasts alike,
work on numerous aspects of the field to make amazing things happen. One of many
such areas is the domain of Computer Vision.
The agenda for this field is to enable machines to view the world as humans do,
perceive it in a similar manner and even use the knowledge for a multitude of tasks
such as Image & Video recognition, Image Analysis & Classification, Media
Recreation, Recommendation Systems, Natural Language Processing, etc. The
advancements in Computer Vision with Deep Learning have been constructed and
perfected with time, primarily over one particular algorithm: the Convolutional Neural Network.
Introduction
A CNN sequence to classify handwritten digits
A Convolutional Neural Network (ConvNet/CNN):
It is a Deep Learning algorithm which can take in an input image, assign importance
(learnable weights and biases) to various aspects/objects in the image and be able to
differentiate one from the other. The pre-processing required in a ConvNet is much lower as
compared to other classification algorithms. While in primitive methods filters are hand-
engineered, with enough training, ConvNets have the ability to learn these
filters/characteristics.
Why ConvNets over Feed-Forward Neural Nets?
Input Image
4x4x3 RGB Image
In the figure, we have an RGB image which has been separated by its three color
planes — Red, Green, and Blue. There are a number of such color spaces in which images
exist — Grayscale, RGB, HSV, CMYK, etc.
You can imagine how computationally intensive things would get once the images
reach dimensions, say 8K (7680×4320). The role of the ConvNet is to reduce the images into
a form which is easier to process, without losing features which are critical for getting a good
prediction. This is important when we are to design an architecture which is not only good at
learning features but also is scalable to massive datasets.
Convolution Layer — The Kernel
Convoluting a 5x5x1 image with a 3x3x1 kernel to get a 3x3x1 convolved feature
Image Dimensions = 5 (Height) x 5 (Breadth) x 1 (Number of channels; an RGB image would have 3).
In the above demonstration, the green section represents our 5x5x1 input image, I. The element involved
in carrying out the convolution operation in the first part of a Convolutional Layer is called
the Kernel/Filter, K, represented in the color yellow. We have selected K as a 3x3x1
matrix. At each position, K is multiplied element-wise with the portion of the image over which
the kernel is hovering, and the products are summed.
Movement of the Kernel
The filter moves to the right with a certain Stride Value till it parses the complete
width. Moving on, it hops down to the beginning (left) of the image with the same Stride
Value and repeats the process until the entire image is traversed.
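The kernel operation and its stride-based movement can be sketched with nested loops in plain Python. The image and kernel values are hypothetical, and the function assumes a square single-channel image, a square kernel, and no padding.

```python
def conv2d(image, kernel, stride=1):
    """Slide the kernel over the image; at each position multiply element-wise and sum."""
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    output = []
    for r in range(0, out_size * stride, stride):       # hop down by the stride
        row = []
        for c in range(0, out_size * stride, stride):   # move right by the stride
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output

# Hypothetical 5x5x1 image and 3x3x1 kernel
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

stride_1 = conv2d(image, kernel, stride=1)   # 3x3 convolved feature
stride_2 = conv2d(image, kernel, stride=2)   # 2x2 convolved feature
print(stride_1)
print(stride_2)
```

With a stride of 1 the kernel visits nine positions; with a stride of 2 it visits only four, so the convolved feature is smaller.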
Convolution Operation with Stride Length = 2
The objective of the Convolution Operation is to extract features, such
as edges, from the input image. ConvNets need not be limited to only one Convolutional
Layer. Conventionally, the first ConvLayer is responsible for capturing the Low-Level
features such as edges, color, gradient orientation, etc. With added layers, the architecture
adapts to the High-Level features as well, giving us a network which has the wholesome
understanding of images in the dataset, similar to how we would.
There are two types of results to the operation — one in which the convolved feature is
reduced in dimensionality as compared to the input, and the other in which the dimensionality
is either increased or remains the same. This is done by applying Valid Padding in case of the
former, or Same Padding in the case of the latter.
On the other hand, if we perform the same operation without padding, we are
presented with a 3x3x1 matrix, which for this 5x5x1 input happens to match the dimensions of the
Kernel itself; this is Valid Padding.
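The effect of padding and stride on the output size follows the standard relation (n + 2p - k) / s + 1, rounded down. The helper below is a small sketch of that arithmetic; the function name is hypothetical.

```python
def conv_output_size(n, k, stride=1, padding=0):
    """Output width for an n-wide input, k-wide kernel, given stride and padding."""
    return (n + 2 * padding - k) // stride + 1

valid = conv_output_size(5, 3)                # no padding: output shrinks
same = conv_output_size(5, 3, padding=1)      # one pixel of padding keeps the size
strided = conv_output_size(5, 3, stride=2)    # a larger stride shrinks it further
print(valid, same, strided)
```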
The following repository houses many such GIFs which would help you get a better
understanding of how Padding and Stride Length work together to achieve results relevant to
our needs.
vdumoulin/conv_arithmetic (github.com): a technical report on convolution arithmetic in the
context of deep learning.
Pooling Layer
The Convolutional Layer and the Pooling Layer together form the i-th layer of a
Convolutional Neural Network. Depending on the complexities in the images, the number of
such layers may be increased for capturing low-level details even further, but at the cost of
more computational power.
After going through the above process, we have successfully enabled the model to
understand the features. Moving on, we are going to flatten the final output and feed it to a
regular Neural Network for classification purposes.
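The flatten-then-classify step can be sketched as below. The feature map, weights, and biases are hypothetical untrained values, and the softmax output is an assumption about how class scores are turned into probabilities.

```python
import math

def flatten(feature_map):
    """Turn a 2-D feature map into a 1-D vector, row by row."""
    return [value for row in feature_map for value in row]

def fully_connected(inputs, weights, biases):
    """One dense layer: each output is a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

feature_map = [[1.0, 0.0],
               [0.0, 1.0]]                # hypothetical final feature map
weights = [[1.0, 0.0, 0.0, 1.0],          # hypothetical weights for class 0
           [0.0, 1.0, 1.0, 0.0]]          # hypothetical weights for class 1
biases = [0.0, 0.0]

flat = flatten(feature_map)
probs = softmax(fully_connected(flat, weights, biases))
print(flat, probs)
```

In a trained network the weights and biases would be learned by backpropagation rather than set by hand.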
Classification — Fully Connected Layer (FC Layer)
Conclusion:-