Combat for vanishing gradient - MLK - Machine Learning Knowledge

In their paper “Deep Sparse Rectifier Neural Networks”, Xavier Glorot, Antoine Bordes, and Yoshua Bengio show that the ReLU activation function helps avoid the vanishing gradient problem. This means that, apart from GPUs, the deep learning community now has another tool for avoiding the long and impractical training times of deep neural networks.
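
To see why the choice of activation matters, the following is a minimal sketch rather than anything taken from the paper. It assumes a toy chain of identical layers with all weights fixed at 1, so the backpropagated gradient is just the product of the activation derivatives. The helper name `chained_gradient`, the depth of 20, and the chosen pre-activation values are illustrative assumptions.

```python
import math


def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum value is 0.25, reached at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def chained_gradient(grad_fn, pre_activation, depth):
    """Hypothetical helper: product of `depth` identical activation derivatives.

    Assumes every layer sees the same pre-activation and every weight is 1,
    which isolates the effect of the activation function alone.
    """
    return grad_fn(pre_activation) ** depth


depth = 20  # illustrative depth, not a value from the paper

# Best case for the sigmoid: every unit sits at x = 0, where the slope peaks.
sigmoid_chain = chained_gradient(sigmoid_grad, 0.0, depth)

# Active ReLU units pass the gradient through unchanged.
relu_chain = chained_gradient(relu_grad, 1.0, depth)

print(f"sigmoid gradient after {depth} layers: {sigmoid_chain:.3e}")
print(f"ReLU gradient after {depth} layers:    {relu_chain:.3e}")
```

Even in the sigmoid's best case the gradient shrinks by a factor of at least 4 per layer, while an active ReLU unit leaves it intact. The sketch ignores weight matrices, and a ReLU unit with a negative input passes no gradient at all, so real networks behave less cleanly than this.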
