PyTorch Activation Functions – ReLU, Leaky ReLU, Sigmoid, Tanh and Softmax

Characteristics of Good Activation Functions in Neural Networks

There are many activation functions that can be used in neural networks. Before we look at the popular ones in PyTorch, let us understand what makes a good activation function.

  • Non-Linearity – The activation function should add non-linearity to the neural network, especially in the neurons of the hidden layers. Without it, a stack of layers can only represent a linear relationship, and real-world problems can rarely be explained by linear relationships alone.
  • Differentiable – The activation function should be differentiable (or, like ReLU, differentiable everywhere except at isolated points). During the training phase, the neural network learns by back-propagating the error from the output layer to the hidden layers. The backpropagation algorithm uses the derivative of the activation function of the hidden-layer neurons to adjust their weights so that the error is reduced in the next training epoch.
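
The two properties above can be checked with a minimal sketch. It is only an illustration: the layer sizes, the random inputs and the choice of tanh are assumptions made for this example. The first part shows that two linear layers with no activation between them behave like a single linear map, and the second part compares the gradient computed by autograd with the known derivative of tanh.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # make the random weights and inputs reproducible

# Non-linearity: two linear layers with nothing in between
# are equivalent to a single linear map (sizes chosen arbitrarily).
lin1 = nn.Linear(3, 4, bias=False)
lin2 = nn.Linear(4, 2, bias=False)
x = torch.randn(5, 3)

stacked = lin2(lin1(x))
collapsed = x @ (lin2.weight @ lin1.weight).T  # one combined weight matrix
linear_collapses = torch.allclose(stacked, collapsed, atol=1e-6)

# Putting tanh between the layers breaks that equivalence.
with_activation = lin2(torch.tanh(lin1(x)))
activation_changes_output = not torch.allclose(with_activation, collapsed, atol=1e-6)

# Differentiability: autograd supplies the derivative used by backpropagation.
t = torch.tensor([-2.0, 0.0, 2.0], requires_grad=True)
torch.tanh(t).sum().backward()  # fills t.grad with d tanh(t_i) / d t_i

# Analytical derivative of tanh: 1 - tanh(t)^2.
analytic_grad = 1 - torch.tanh(t.detach()) ** 2
grads_match = torch.allclose(t.grad, analytic_grad)

print(linear_collapses, activation_changes_output, grads_match)
```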

Types of PyTorch Activation Functions

In this section, we will see different types of activation layers available in PyTorch along with examples and their advantages and disadvantages.

The sigmoid activation function, sigmoid(x) = 1 / (1 + e^(-x)), produces an output between 0 and 1, which can be interpreted as a probability.
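
As a small sketch of how this looks in PyTorch, the example below applies the sigmoid through the `nn.Sigmoid` module and the `torch.sigmoid` function and compares both with the formula written out by hand. The input values are arbitrary and chosen only for illustration.

```python
import torch
import torch.nn as nn

# Arbitrary sample inputs, including negative, zero and positive values.
x = torch.tensor([-3.0, -1.0, 0.0, 1.0, 3.0])

# Module form, convenient inside nn.Sequential models.
sigmoid_layer = nn.Sigmoid()
out_module = sigmoid_layer(x)

# Functional form, convenient inside a forward() method.
out_functional = torch.sigmoid(x)

# The same computation written out from the definition 1 / (1 + e^(-x)).
out_manual = 1 / (1 + torch.exp(-x))

print(out_module)
print(torch.allclose(out_module, out_functional))
print(torch.allclose(out_module, out_manual))
```

Both forms compute the same values; which one to use is mostly a matter of how the model is written.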

Advantages of Sigmoid Activation Function

  • The sigmoid activation function is both non-linear and differentiable, which are desirable characteristics for an activation function.
  • Because its output lies between 0 and 1, it can be used in the output layer to produce a probability for binary classification.
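
The binary classification use can be sketched as follows. The model, its layer sizes, the random features and the labels are all hypothetical and exist only to show a sigmoid in the output layer paired with `nn.BCELoss`; this is not a recommended architecture.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # reproducible random weights and inputs

# Hypothetical classifier: 4 input features, one hidden layer, 1 output.
model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 1),
    nn.Sigmoid(),  # squashes the single output into (0, 1)
)

# Made-up batch of 6 samples with binary labels.
features = torch.randn(6, 4)
labels = torch.tensor([[0.0], [1.0], [1.0], [0.0], [1.0], [0.0]])

probs = model(features)             # predicted probability of class 1
loss = nn.BCELoss()(probs, labels)  # binary cross-entropy on probabilities
preds = (probs >= 0.5).float()      # threshold at 0.5 to get class labels

print(probs.squeeze(1))
print(loss.item())
print(preds.squeeze(1))
```

In practice, many PyTorch models drop the final `nn.Sigmoid` and use `nn.BCEWithLogitsLoss` on the raw outputs instead, which combines the two steps in a numerically more stable way.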

Disadvantages of Sigmoid Activation Function

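  • The sigmoid activation is computationally more expensive than simpler functions such as ReLU because it evaluates an exponential, and networks that use it may converge slowly during training.
  • When the input values are very negative or very positive, the sigmoid saturates and its gradient becomes close to zero, which can cause the neural network to stop learning. This issue is known as the vanishing gradient problem, and it is why the sigmoid activation function is generally avoided in the hidden layers of deep networks.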
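
The vanishing gradient effect described above can be observed directly with autograd. The sketch below is an illustration under assumed values: the inputs -10, 0 and 10, the starting value 0.5 and the depth of 10 are arbitrary choices. It first measures the sigmoid's gradient at saturated and non-saturated inputs, then chains several sigmoids to show how the gradient shrinks as it passes through each one.

```python
import torch

# Gradient of the sigmoid at a very negative, a zero and a very positive input.
x = torch.tensor([-10.0, 0.0, 10.0], requires_grad=True)
torch.sigmoid(x).sum().backward()
saturation_grads = x.grad.clone()
print(saturation_grads)  # largest at 0, close to zero at the extremes

# Chain of sigmoids: the derivative of each one is at most 0.25,
# so the gradient reaching the input shrinks with every layer.
depth = 10
z = torch.tensor(0.5, requires_grad=True)
h = z
for _ in range(depth):
    h = torch.sigmoid(h)
h.backward()
deep_grad = z.grad.item()

print(deep_grad)      # gradient after passing through 10 sigmoids
print(0.25 ** depth)  # upper bound implied by the 0.25 maximum slope
```

A real network also multiplies by the layer weights at each step, so the exact numbers differ, but the repeated factor of at most 0.25 is what drives the effect.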
