Neural Network Primitives Part 2 – Perceptron Model (1957)

Introduction

The Perceptron was conceptualized by Frank Rosenblatt in 1957 and is one of the most primitive forms of artificial neural networks.

Welcome to part 2 of the Neural Network Primitives series, where we are exploring the historical forms of artificial neural networks that laid the foundation of modern 21st-century deep learning.

In part 1, we understood how the McCulloch-Pitts neuron was the first inspiration for representing a biological neuron's behavior in a mathematical model. But it had very limited capabilities and was not a true machine learning model either.

You can read more about how the McCulloch-Pitts neuron works in Part 1 of this series.

In this post, we will see how the Perceptron built upon the ideas laid down by the McCulloch-Pitts neuron to create a true machine learning model.

Limitations of the McCulloch-Pitts Neuron

  1. The McCulloch-Pitts neuron could only take boolean values as inputs. But real-world problems are not limited to boolean values; for example, you cannot give it real numbers like age, price, or area as inputs.
  2. It does not assign weights to the inputs. Weights are very important for indicating which input features play a major role in the output and which play very little role. Without weights, the model cannot determine the importance of the input data.
  3. The only parameter in this model is the threshold Θ, and even that we compute on our own. There is no learning taking place to determine the optimum value of the threshold based on past data. So this is not a true machine learning model.

Keeping these drawbacks in mind, let us now see how the Perceptron model works.

Perceptron Model

Let us first understand the anatomy of a Perceptron. Below are its various parts –

  • Inputs
  • Weights
  • Neuron
  • Output

Inputs

The inputs to a perceptron are no longer limited to boolean values. The input attributes can be any real numbers.

Weights

Each input has a weight associated with it in the perceptron model. These weights are initially unknown and are learned by the Perceptron during the training phase.

For ease of mathematical representation, even the threshold Θ is treated as a weight whose input value is always 1.
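
To make this concrete, here is a minimal Python sketch of the trick (all numbers here are made up for illustration) –

    # With a constant input X0 = 1 and weight W0 = -Threshold, the condition
    # "sum of inputs*weights >= Threshold" becomes "W0*X0 + sum of inputs*weights >= 0".
    threshold = 0.5
    inputs = [1.0, 0.2, 0.3]     # example input values
    weights = [0.4, 0.6, 0.1]    # example weights
    summation = sum(w * x for w, x in zip(weights, inputs))
    fired_original = summation >= threshold
    fired_bias_form = (-threshold) * 1 + summation >= 0
    assert fired_original == fired_bias_form  # same decision either way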

Neuron

A neuron is a computational unit with incoming input signals. The input signals are combined, and an output is fired. The neuron further consists of the following two elements –

Summation Function

This simply calculates the sum of the incoming inputs, each multiplied by its respective weight.

Activation Function

The activation function used here is the step function. It checks whether the summation is greater than or equal to the threshold value: if yes, the neuron fires (output = 1); if not, the neuron does not fire (output = 0).

  • Neuron fires: Output = 1, if Summation(Inputs*Weights) >= Threshold
  • Neuron does not fire: Output = 0, if Summation(Inputs*Weights) < Threshold

Though the above condition looks very simple, with a little algebra it can be rewritten as follows, which is the more common convention in the machine learning community –

  • Neuron fires: Output = 1, if Summation(Inputs*Weights) - Threshold >= 0
  • Neuron does not fire: Output = 0, if Summation(Inputs*Weights) - Threshold < 0
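
As a small sketch in Python (the function names here are ours, not from any library), the step activation over this rewritten condition looks like –

    def step(z):
        # Step activation: fire (1) if z >= 0, otherwise stay silent (0)
        return 1 if z >= 0 else 0

    def weighted_sum(inputs, weights, threshold):
        # Summation(Inputs*Weights) - Threshold, matching the rewritten condition
        return sum(w * x for w, x in zip(weights, inputs)) - threshold

    # Example: 0.4*1.0 + 0.6*0.2 + 0.1*0.3 = 0.55, and 0.55 - 0.5 >= 0, so the neuron fires
    print(step(weighted_sum([1.0, 0.2, 0.3], [0.4, 0.6, 0.1], threshold=0.5)))  # prints 1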

Output

This is simply the output of the neuron, which produces only the binary values 0 or 1. A value of 0 indicates that the neuron does not fire; a value of 1 indicates that it does.

How the Perceptron Model works

After understanding the anatomy, let us now see how a Perceptron (which is already trained) works.

  1. The input data is fed into the Perceptron.
  2. Multiply each input by its respective weight and sum the products.
  3. Apply the step function to the above summation, as in the sketch below –
    • If the summation >= 0, the perceptron fires: output = 1
    • If the summation < 0, the perceptron does not fire: output = 0
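
Putting the three steps together, the forward pass of an already-trained perceptron might look like this sketch (the weights here are hypothetical, with the threshold folded in as W0 against a constant input of 1, as described earlier) –

    def predict(inputs, weights):
        # Forward pass: weights[0] is the threshold-as-weight (paired with a
        # constant input of 1) and weights[1:] pair with the actual inputs
        summation = weights[0] * 1 + sum(w * x for w, x in zip(weights[1:], inputs))
        return 1 if summation >= 0 else 0  # step function

    weights = [-0.5, 0.6, 0.4]       # hypothetical trained weights: W0, W1, W2
    print(predict([1, 0], weights))  # -0.5 + 0.6 = 0.1 >= 0, fires: prints 1
    print(predict([0, 0], weights))  # -0.5 < 0, does not fire: prints 0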

The above process is visualized in the animation below.

Perceptron (1957)

Training a Perceptron

The working we explained above is of a perceptron that has already been trained on past data. Now the question arises: how do we train a perceptron? A perceptron can be trained using the Perceptron Learning Algorithm.

Before understanding this learning algorithm, let us first understand a couple of related concepts.

Learning Parameters

Machine learning models have parameters whose optimal values they learn during the training phase.

In a Perceptron, the parameters that are learned during the training phase are –

  • Weights of the input parameters
  • Threshold value – which is also incorporated as a weight with a constant input value of 1 in the perceptron architecture (see above)

So effectively we only need to train the weights (W0, W1, W2, etc.) during the training phase.

Errors and Weight adjustment

During the training phase, the weights are initialized randomly. With each data point, the error is calculated and the weights are adjusted so as to reduce the error.

As you have seen above, the possible output of a perceptron is binary, i.e. 0 or 1. So there are only four possible combinations of the actual value and the output value.

The chart below shows these combinations and will help you build an intuition for how the weights should be adjusted in case of an error.

  Actual | Output | Error | Weight adjustment
  1      | 0      | 1     | positive
  0      | 1      | -1    | negative
  0      | 0      | 0     | none
  1      | 1      | 0     | none

Perceptron – Error and Weight adjustments

Perceptron Learning Algorithm

Let us first see the pseudocode of this algorithm, with respect to our example in the above animation –

Initialize all the weights randomly

Start Loop

  For each data point in the dataset –

    1. Give the inputs to the Perceptron and get the output
    2. Calculate the Error = Actual - Output
    3. Adjust the weight of each input as –
      • Weight(X0) = Weight(X0) + Error * Input(X0)
      • Weight(X1) = Weight(X1) + Error * Input(X1)
      • Weight(X2) = Weight(X2) + Error * Input(X2)

  End For

Continue Loop while Error still exists
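
Here is one way this pseudocode could be turned into runnable Python. This is a sketch under our earlier assumptions: the threshold is W0 with a constant input of 1, and predict is the forward-pass function from the sketch above –

    import random

    def train_perceptron(dataset, n_inputs, max_epochs=100):
        # dataset is a list of (inputs, actual) pairs with binary actual values
        # Initialize all the weights randomly (weights[0] is the threshold-as-weight)
        weights = [random.uniform(-1, 1) for _ in range(n_inputs + 1)]
        for _ in range(max_epochs):
            errors = 0
            for inputs, actual in dataset:
                output = predict(inputs, weights)
                error = actual - output              # Error = Actual - Output
                if error != 0:
                    errors += 1
                    weights[0] += error * 1          # X0 is the constant input 1
                    for i, x in enumerate(inputs):
                        weights[i + 1] += error * x  # Weight(Xi) = Weight(Xi) + Error * Input(Xi)
            if errors == 0:
                break  # a full pass with no errors: training has converged
        return weights

The max_epochs cap is our addition: the pseudocode loops while errors exist, so without a cap the loop would never end on data the perceptron cannot learn.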

Intuition behind Perceptron Learning Algorithm

For the most part, the pseudocode should be self-explanatory. We make multiple passes over the dataset; in each pass we feed the data one by one through the perceptron, calculate the error, and adjust the weights. The loop continues till there are no errors.

You might be scratching your head over the weight adjustment part, so let me explain the intuition behind it –

Actual = 1, Output = 0, Error = 1

In this case, since the output is less than the actual value, we would like to increase the summation to push the output towards the actual. For a positive input, this means a positive weight adjustment.

Weight = Weight + Error*Input = Weight + 1*Input (weight adjustment is positive)

Actual = 0, Output = 1, Error = -1

In this case, since the output is more than the actual value, we would like to decrease the summation to pull the output back towards the actual. For a positive input, this means a negative weight adjustment.

Weight = Weight + Error*Input = Weight - 1*Input (weight adjustment is negative)

Actual = 0, Output = 0, Error = 0

In this case, since the output and the actual value are the same, there is no error, so no weight adjustment is required.

Weight = Weight + Error*Input = Weight + 0*Input = Weight (no weight adjustment)

Actual = 1, Output = 1, Error = 0

In this case, since the output and the actual value are the same, there is no error, so no weight adjustment is required.

Weight = Weight + Error*Input = Weight + 0*Input = Weight (no weight adjustment)
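
As a quick numeric check of the first two cases, with made-up numbers Weight = 0.5 and Input = 0.25 –

    weight, x = 0.5, 0.25
    print(weight + 1 * x)     # Error = 1:  prints 0.75, the weight (and the summation) goes up
    print(weight + (-1) * x)  # Error = -1: prints 0.25, the weight goes down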

Some points to consider

  1. The Perceptron addresses the three drawbacks of the McCulloch-Pitts neuron that we discussed above. It can take real numbers as inputs, and weights are associated with those inputs. It is also a true machine learning model that can be trained on past data.
  2. It can still, however, produce only binary output. So a perceptron can be used only for binary classification problems.
  3. A perceptron can classify only data that are linearly separable. So it cannot solve the XOR problem (see the sketch after this list).
  4. If the data is linearly separable, the perceptron learning algorithm will always converge in a finite number of passes.
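
To see points 3 and 4 in action, one can try the training sketch from above on AND (linearly separable) versus XOR (not linearly separable); the data and expectations below are illustrative –

    AND_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
    XOR_data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

    w = train_perceptron(AND_data, n_inputs=2)
    print([predict(x, w) for x, _ in AND_data])  # converges: prints [0, 0, 0, 1]

    w = train_perceptron(XOR_data, n_inputs=2)
    print([predict(x, w) for x, _ in XOR_data])  # never matches [0, 1, 1, 0]; stops only at max_epochs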

In the End…

I hope this was both an informative and entertaining read, as usual. We have taken the next step in our series, jumping from the very trivial McCulloch-Pitts neuron to a somewhat more sophisticated perceptron that can actually learn from past experience.

The perceptron still has some drawbacks, which we touched upon in the section above. We will continue our Neural Network Primitives series to explore the roots of modern Deep Learning.

Do share your feedback about this post in the comments section below. If you found this post informative, then please do share it and subscribe to us by clicking on the bell icon for quick notifications of new upcoming posts. And yes, don't forget to join our new community MLK Hub and let's Make AI Simple together.
