L17-Perceptron

The document discusses the history and fundamentals of the perceptron, an early neural network learning model invented by Frank Rosenblatt in the late 1950s. It explains the perceptron's goal of finding a hyperplane that separates data into two classes, its use as a binary classifier, and the learning rule used to train the model. It also highlights the perceptron's limitations, particularly the requirement of linear separability and the need for multilayer networks for more complex problems.


Perceptron

Foundations of Data Analysis

April 12, 2022


History of Perceptron

■ Frank Rosenblatt (1928-1971)
■ Invented the perceptron algorithm


History of Perceptron

■ Mark 1 Perceptron (1958)
■ 20 x 20 pixel camera input
■ Hardware, not software!

"an electronic computer that [the Navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence"
- NY Times, 1958
Perceptron Learning Algorithm

■ First neural network learning model, developed in the late 1950s
■ Simple and limited (single-layer model)
■ Basic concepts are similar to those of multi-layer models
What is the Perceptron?
The goal of the perceptron algorithm is to find a hyperplane that separates a set of data into two classes.

[Figure: a hyperplane (decision boundary) separating Class 1 from Class 0]

• Binary classifier
• Supervised learning
Perceptron

[Figure: Class 1 and Class 0 points separated by the decision boundary]

f(x) = { 1, if w · x + b > 0
       { 0, otherwise

where b is the bias term and w · x = Σ_{i=1}^{n} w_i x_i (dot product)
Perceptron

[Diagram: inputs x1 ... xn, weighted by w1 ... wn, are summed into the net activation; the activation function produces the output z]

z = { 1, if w · x > θ
    { 0, if w · x ≤ θ

w · x = Σ_{i=1}^{n} w_i x_i

• Learning means finding weights such that an objective function is minimized
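The threshold unit above can be sketched in a few lines of Python (names like `perceptron` are illustrative, not from the slides):

```python
def perceptron(x, w, theta):
    """Return 1 if the net activation w . x exceeds the threshold theta, else 0."""
    net = sum(wi * xi for wi, xi in zip(w, x))  # w . x = sum_i w_i * x_i
    return 1 if net > theta else 0

# Example: net = 1.0*0.6 + 0.0*(-0.4) = 0.6 > 0.5, so the unit fires.
print(perceptron([1.0, 0.0], [0.6, -0.4], 0.5))  # -> 1
```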
Activation Function
Outputs the label given an input or a set of inputs.

f(x) := sgn(x) = { 1, if x ≥ 0
                 { −1, if x < 0
Step function

f(x) = max(0, x)
ReLU (rectified linear unit)

f(x) := σ(x) = 1 / (1 + e^(−ax))
Sigmoid function
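The three activation functions above can be written directly from their definitions (a minimal sketch; the slope parameter `a` of the sigmoid defaults to 1 here):

```python
import math

def step(x):
    """sgn-style step: 1 if x >= 0, -1 otherwise."""
    return 1 if x >= 0 else -1

def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)

def sigmoid(x, a=1.0):
    """Logistic sigmoid 1 / (1 + e^(-a x))."""
    return 1.0 / (1.0 + math.exp(-a * x))
```

Note that unlike the step function, the sigmoid is differentiable everywhere, which is what makes it usable for gradient-based training of multilayer networks.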
Perceptron as a Single Layer Neuron

Examples

Example 1: inputs x = (0.9, 0.1), weights w = (0.5, −0.3), threshold θ = 0.2; z = ?
Example 2: inputs x = (0.8, 0.6, 0.2), weights w = (0.4, 0.2, −0.5), threshold θ = 0.1; z = ?

z = { 1, if w · x > θ
    { 0, if w · x ≤ θ
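The two examples can be checked mechanically with the same threshold rule:

```python
def threshold_unit(x, w, theta):
    """Output 1 if w . x > theta, else 0."""
    net = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if net > theta else 0

# Example 1: w . x = 0.9*0.5 + 0.1*(-0.3) = 0.42 > 0.2, so z = 1
z1 = threshold_unit([0.9, 0.1], [0.5, -0.3], 0.2)
# Example 2: w . x = 0.8*0.4 + 0.6*0.2 + 0.2*(-0.5) = 0.34 > 0.1, so z = 1
z2 = threshold_unit([0.8, 0.6, 0.2], [0.4, 0.2, -0.5], 0.1)
print(z1, z2)  # -> 1 1
```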
How to Learn the Perceptron?

[Figure: Class 1 and Class 0 points separated by the decision boundary]

f(x) = { 1, if w · x + b > 0
       { 0, otherwise

w and b are unknown parameters

■ In supervised learning, the network's output is compared with known correct answers (labels)
■ "Learning with a teacher"
CS 478 - Perceptrons
Perceptron Learning Rules
■ Consider linearly separable problems
■ How to find appropriate weights?
■ Check whether the output o has the desired value d (given labels)

w_i^new = w_i^old + Δw_i,   Δw_i = η (d − o) x_i

η is called the learning rate, with 0 < η ≤ 1

Perceptron Convergence Theorem: guaranteed to find a solution in finite time if a solution exists
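The update rule above can be sketched as a small training loop (illustrative Python; the threshold is folded into a bias term `b` that is updated alongside the weights, as if it were a weight on a constant input of 1):

```python
def train_perceptron(data, n_inputs, eta=0.1, epochs=50):
    """Learn weights w and bias b with the rule  w_i <- w_i + eta*(d - o)*x_i."""
    w = [0.0] * n_inputs
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, d in data:
            o = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            if o != d:
                errors += 1
                for i in range(n_inputs):
                    w[i] += eta * (d - o) * x[i]
                b += eta * (d - o)
        if errors == 0:  # a full pass with no mistakes: the data is separated
            break
    return w, b

# Logical AND is linearly separable, so the rule converges to a separating line.
w, b = train_perceptron([([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)], 2)
```

Note that weights only change on misclassified points: when o = d, the factor (d − o) is zero and the update vanishes.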
Perceptron Learning Rules

■ The algorithm converges to a correct classification if and only if the training data is linearly separable
■ When assigning a value to η we must balance two conflicting requirements
■ Averaging of past inputs to provide stable weight estimates, which requires a small η
■ Fast adaptation to real changes in the underlying distribution, which requires a large η
Linear Separability

Limited Functionality of Hyperplane

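A short experiment (illustrative, not from the slides) makes the limitation concrete: the learning rule from the previous slides never reaches an error-free pass on XOR, because no single hyperplane separates its two classes:

```python
def train(data, eta=0.2, epochs=200):
    """Perceptron rule with bias; returns (w, b, converged)."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        errors = 0
        for x, d in data:
            o = 1 if w[0]*x[0] + w[1]*x[1] + b > 0 else 0
            if o != d:
                errors += 1
                w[0] += eta * (d - o) * x[0]
                w[1] += eta * (d - o) * x[1]
                b += eta * (d - o)
        if errors == 0:  # would imply a separating line exists
            return w, b, True
    return w, b, False

xor = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
_, _, converged = train(xor)
print(converged)  # -> False: XOR is not linearly separable
```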
Multilayer Network

[Diagram: input layer → hidden layer → output layer]

o1 = sgn( Σ_{i=0}^{n} w_1i x_i )

o2 = sgn( Σ_{i=0}^{n} w_2i x_i )
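A minimal sketch of why the extra layer helps: with hand-set (purely illustrative) weights, two hidden threshold units computing OR and NAND, combined by an output unit computing AND, reproduce XOR, which no single perceptron can represent. A 0/1 step variant of sgn is used here for readability:

```python
def step(x):
    """0/1 step activation: 1 if x >= 0, else 0."""
    return 1 if x >= 0 else 0

def two_layer_xor(x1, x2):
    """Two-layer network with hand-set weights: XOR = AND(OR, NAND)."""
    o1 = step(x1 + x2 - 0.5)    # hidden unit 1: OR
    o2 = step(-x1 - x2 + 1.5)   # hidden unit 2: NAND
    return step(o1 + o2 - 1.5)  # output unit: AND of the hidden outputs

for a in (0, 1):
    for b in (0, 1):
        print(a, b, two_layer_xor(a, b))  # prints the XOR truth table
```

In practice these weights are learned rather than set by hand, which requires differentiable activations and backpropagation; the perceptron learning rule alone does not extend to hidden layers.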
