The world right now is seeing a global AI revolution across all industries, and one of the driving factors of this revolution is deep learning. Thanks to giants like Google and Facebook, deep learning has become a popular term, and people might think it is a recent discovery. But you might be surprised to know that the history of deep learning dates back to the 1940s.
Indeed, deep learning did not appear overnight; rather, it evolved slowly and gradually over seven decades. Behind this evolution are many machine learning researchers who worked with great determination even when almost no one believed that neural networks had any future.
This is our humble attempt to take you through the history of deep learning, relive the key discoveries made by researchers, and see how all these small steps contributed to the modern deep learning boom.
Walter Pitts and Warren McCulloch, in their paper “A Logical Calculus of the Ideas Immanent in Nervous Activity”, present a mathematical model of the biological neuron. This McCulloch-Pitts neuron has very limited capability and no learning mechanism, yet it will lay the foundation for artificial neural networks and deep learning.
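To get a feel for how simple this unit is, here is a minimal sketch (not taken from the paper itself) of a McCulloch-Pitts style threshold unit in Python: it just sums its binary inputs and fires if the sum reaches a hand-picked threshold, with no learning involved.

```python
# A minimal sketch of a McCulloch-Pitts style unit: it sums binary inputs
# and fires if the sum reaches a fixed threshold. The weights are implicitly
# all 1 and the threshold is set by hand; there is no learning.

def mcp_neuron(inputs, threshold):
    """Fire (return 1) if the number of active inputs meets the threshold."""
    return 1 if sum(inputs) >= threshold else 0

# Logical AND and OR realised purely by choosing the threshold.
print(mcp_neuron([1, 1], threshold=2))  # AND(1, 1) -> 1
print(mcp_neuron([1, 0], threshold=2))  # AND(1, 0) -> 0
print(mcp_neuron([1, 0], threshold=1))  # OR(1, 0)  -> 1
```

Changing the threshold by hand is the only way to change its behaviour, which is exactly the limitation that the Perceptron would later address.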
In his paper “The Perceptron: A Perceiving and Recognizing Automaton”, Frank Rosenblatt presents a new avatar of the McCulloch-Pitts neuron, the Perceptron, which has a true learning mechanism and can learn binary classification on its own. This inspires a wave of research on shallow neural networks for years to come, until the first AI winter.
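The key novelty is the learning rule: the weights are nudged whenever a prediction is wrong. Below is a minimal sketch of that rule on a toy, linearly separable problem (logical AND); the learning rate and epoch count are illustrative choices, not values from Rosenblatt’s work.

```python
import numpy as np

# A minimal sketch of the perceptron learning rule on logical AND.

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])          # AND labels

w = np.zeros(2)
b = 0.0
lr = 0.1

for epoch in range(10):
    for xi, target in zip(X, y):
        pred = 1 if xi @ w + b > 0 else 0
        # Update the weights only when the prediction is wrong.
        w += lr * (target - pred) * xi
        b += lr * (target - pred)

print([1 if xi @ w + b > 0 else 0 for xi in X])  # expected: [0, 0, 0, 1]
```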
Henry J. Kelley, in his paper “Gradient Theory of Optimal Flight Paths”, shows the first-ever version of a continuous backpropagation model. His model is framed in the context of control theory, yet it lays the groundwork for refinements that would later be used in artificial neural networks.
Stuart Dreyfus, in his paper “The Numerical Solution of Variational Problems”, shows a backpropagation model that uses the simple derivative chain rule instead of the dynamic programming that earlier backpropagation models used. This is yet another small step that strengthens the future of deep learning.
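The chain rule in question is the same one every modern deep learning framework relies on: the derivative of a composition f(g(x)) is f′(g(x))·g′(x). A tiny illustrative check in Python (the specific functions are arbitrary choices):

```python
import math

# Chain rule sketch: compare the analytic derivative of f(g(x)) with a
# numerical finite-difference estimate. The functions below are illustrative.

def g(x): return x ** 2
def f(u): return math.sin(u)

def df_dx(x):
    return math.cos(g(x)) * 2 * x                 # f'(g(x)) * g'(x)

x, eps = 1.3, 1e-6
numeric = (f(g(x + eps)) - f(g(x - eps))) / (2 * eps)
print(df_dx(x), numeric)                          # the two values agree
```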
Alexey Grigoryevich Ivakhnenko, along with Valentin Grigorʹevich Lapa, creates a hierarchical representation of a neural network that uses polynomial activation functions and is trained using the Group Method of Data Handling (GMDH). It is now considered the first-ever multi-layer perceptron, and Ivakhnenko is often regarded as the father of deep learning.
Marvin Minsky and Seymour Papert publish the book “Perceptrons”, in which they show that Rosenblatt’s perceptron cannot solve functions like XOR, since a single unit can only draw a linear decision boundary. Solving such functions requires perceptrons arranged in multiple layers, for which the perceptron learning algorithm no longer works. This setback triggers the winter of neural network research.
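The XOR limitation is easy to see concretely: no single linear threshold unit can separate the XOR inputs, but two hand-crafted hidden units (an OR and a NAND) feeding an AND unit can. The weights below are chosen by hand for illustration, which is precisely the gap Minsky and Papert pointed at: at the time there was no algorithm to learn them.

```python
# XOR solved by a hand-crafted two-layer threshold network. The weights and
# thresholds are illustrative, set by hand rather than learned.

def step(z):
    return 1 if z > 0 else 0

def xor_two_layer(x1, x2):
    h_or   = step(x1 + x2 - 0.5)        # fires unless both inputs are 0
    h_nand = step(-x1 - x2 + 1.5)       # fires unless both inputs are 1
    return step(h_or + h_nand - 1.5)    # AND of the two hidden units

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_two_layer(a, b))  # prints the XOR truth table
```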
Seppo Linnainmaa publishes the general method for automatic differentiation (reverse mode) that underlies backpropagation and also implements it in computer code. Backpropagation research has now come very far, yet it would not be applied to neural networks until the next decade.
Alexey Grigoryevich Ivakhnenko continues his research on neural networks and creates an 8-layer deep neural network using the Group Method of Data Handling (GMDH).
Kunihiko Fukushima comes up with the Neocognitron, the first convolutional neural network architecture, which could recognize visual patterns such as handwritten characters.
John Hopfield creates the Hopfield network, which is essentially a recurrent neural network. It serves as a content-addressable memory system and would be instrumental for later recurrent models of the modern deep learning era.
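“Content-addressable” means that a stored pattern can be recovered from a partial or corrupted version of itself. A minimal sketch of that behaviour, using a single stored pattern and Hebbian outer-product weights (the pattern size and the number of update steps are illustrative):

```python
import numpy as np

# Hopfield network sketch: store one +1/-1 pattern with the Hebbian rule,
# then recall it from a corrupted copy via thresholded updates.

pattern = np.array([1, -1, 1, -1, 1, -1, 1, -1])   # stored memory
n = len(pattern)

W = np.outer(pattern, pattern).astype(float)       # Hebbian storage
np.fill_diagonal(W, 0)                              # no self-connections

state = pattern.copy()                              # corrupt two bits
state[0] *= -1
state[3] *= -1

for _ in range(5):                                  # synchronous updates
    state = np.where(W @ state >= 0, 1, -1)

print(np.array_equal(state, pattern))               # True: memory recovered
```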
Paul Werbos, based on his 1974 Ph.D. thesis, publicly proposes the use of backpropagation for propagating errors during the training of neural networks. The results of his thesis will eventually lead to the practical adoption of backpropagation by the neural network community.
David H. Ackley, Geoffrey Hinton, and Terrence Sejnowski create the Boltzmann machine, a stochastic recurrent neural network. This network has only visible and hidden units, with no separate output layer.
Terry Sejnowski creates NETtalk, a neural network that learns to pronounce written English text by being shown text as input and matching phonetic transcriptions for comparison.
David Rumelhart, Geoffrey Hinton, and Ronald Williams, in their paper “Learning Representations by Back-propagating Errors”, show the successful use of backpropagation to train neural networks with hidden layers. It opens the gates for training complex, deep neural networks, which had been the main obstacle in the earlier days of research in this area.
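Here is a minimal sketch of the idea, not the paper’s exact setup: a tiny network with one hidden layer learns XOR by pushing the output error backwards through the chain rule. The architecture, loss, learning rate, and iteration count are all illustrative choices.

```python
import numpy as np

# Backpropagation sketch: a 2-4-1 sigmoid network learns XOR. The output
# error is propagated back through the hidden layer to update every weight.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)
lr = 0.5

for _ in range(10000):
    h = sigmoid(X @ W1 + b1)              # forward pass
    out = sigmoid(h @ W2 + b2)
    d_out = out - y                       # output gradient (cross-entropy loss)
    d_h = (d_out @ W2.T) * h * (1 - h)    # chain rule back to the hidden layer
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())               # should be close to [0, 1, 1, 0]
```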
Paul Smolensky comes up with a variation of the Boltzmann machine in which there are no connections within the visible layer or within the hidden layer, only between the two. It is known as the Restricted Boltzmann Machine (RBM). It would become popular in the years to come, especially for building recommender systems.
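The “restricted” bipartite structure is what makes the model tractable: with no connections inside a layer, all hidden units can be sampled at once given the visible units, and vice versa. A minimal sketch of one such Gibbs sampling step (the layer sizes and random weights are illustrative):

```python
import numpy as np

# RBM sketch: one Gibbs step, sampling hidden given visible and then
# visible given hidden, exploiting the bipartite connection structure.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 3
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))
b_v = np.zeros(n_visible)
b_h = np.zeros(n_hidden)

v = rng.integers(0, 2, size=n_visible).astype(float)   # a binary visible vector

p_h = sigmoid(v @ W + b_h)                  # all hidden units in one step
h = (rng.random(n_hidden) < p_h).astype(float)
p_v = sigmoid(h @ W.T + b_v)                # all visible units in one step
v_new = (rng.random(n_visible) < p_v).astype(float)

print(h, v_new)
```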
Yann LeCun uses backpropagation to train a convolutional neural network to recognize handwritten digits. This is a breakthrough moment, as it lays the foundation of modern computer vision using deep learning.
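The core operation in such a network is the convolution: a small filter is slid across the image so the same pattern detector is reused at every position. A minimal sketch of that operation with a hand-picked vertical-edge filter (the filter and the toy image are illustrative; in LeCun’s network the filters are learned by backpropagation):

```python
import numpy as np

# 2-D convolution (cross-correlation) sketch: slide a small kernel over an
# image and record one response per position.

def conv2d(image, kernel):
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.zeros((6, 6))
image[:, 3:] = 1.0                      # left half dark, right half bright

kernel = np.array([[-1, 0, 1],          # hand-picked vertical-edge detector
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

print(conv2d(image, kernel))            # strong responses along the vertical edge
```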
George Cybenko publishes the earliest version of the Universal Approximation Theorem in his paper “Approximation by Superpositions of a Sigmoidal Function”. He proves that a feed-forward neural network with a single hidden layer containing a finite number of neurons can approximate any continuous function on a compact set to arbitrary accuracy. It further adds credibility to deep learning.
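A minimal numerical illustration of the idea (not Cybenko’s proof): a single hidden layer of sigmoid units, combined linearly, can fit a smooth target such as sin(x) on an interval. Here the hidden weights are simply random and only the output weights are solved for; all sizes are illustrative.

```python
import numpy as np

# Universal approximation sketch: fit sin(x) on [-3, 3] with a linear
# combination of 50 random sigmoid features.

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(x).ravel()

n_hidden = 50
W = rng.normal(scale=2.0, size=(1, n_hidden))
b = rng.normal(scale=2.0, size=n_hidden)

H = 1.0 / (1.0 + np.exp(-(x @ W + b)))          # hidden-layer activations
alpha, *_ = np.linalg.lstsq(H, y, rcond=None)   # solve for the output weights

y_hat = H @ alpha
print(np.max(np.abs(y_hat - y)))                # small approximation error
```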
Sepp Hochreiter identifies the vanishing gradient problem, in which error gradients shrink as they are propagated back through many layers, making the training of deep neural networks extremely slow and almost impractical. This problem will continue to trouble the deep learning community for many years to come.
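The effect is easy to demonstrate: the derivative of the sigmoid is at most 0.25, so a gradient backpropagated through many sigmoid layers is multiplied by a small factor at every layer and shrinks roughly exponentially with depth. A minimal sketch (the depth and the pre-activation value are illustrative):

```python
import numpy as np

# Vanishing gradient sketch: the chain rule multiplies one small sigmoid
# derivative per layer, so the gradient decays exponentially with depth.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = 0.5                                      # a typical pre-activation value
local_grad = sigmoid(z) * (1 - sigmoid(z))   # about 0.235, never above 0.25

gradient = 1.0
for layer in range(20):
    gradient *= local_grad                   # one factor per sigmoid layer

print(local_grad, gradient)                  # roughly 1e-13 after 20 layers
```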
Sepp Hochreiter and Jürgen Schmidhuber publish the milestone paper “Long Short-Term Memory” (LSTM). It describes a recurrent neural network architecture with gated memory cells that will go on to revolutionize deep learning in the decades to come.
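The heart of the LSTM is a memory cell whose content is controlled by input, forget, and output gates, which lets information and gradients flow across many time steps. Below is a minimal sketch of a single cell’s forward pass using the standard gate equations; the sizes and random weights are illustrative, and a real implementation would also learn the weights.

```python
import numpy as np

# LSTM cell sketch: one forward step with input (i), forget (f), and
# output (o) gates plus a candidate update (g).

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    n = h_prev.size
    z = W @ x + U @ h_prev + b            # all four gate pre-activations
    i = sigmoid(z[0:n])                   # input gate
    f = sigmoid(z[n:2*n])                 # forget gate
    o = sigmoid(z[2*n:3*n])               # output gate
    g = np.tanh(z[3*n:4*n])               # candidate cell update
    c = f * c_prev + i * g                # cell state carries long-term memory
    h = o * np.tanh(c)                    # hidden state (the cell's output)
    return h, c

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.normal(size=(4 * n_hid, n_in))
U = rng.normal(size=(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)

h = np.zeros(n_hid)
c = np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):      # run the cell over 5 time steps
    h, c = lstm_step(x, h, c, W, U, b)
print(h)
```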
Geoffrey Hinton, Simon Osindero, and Yee-Whye Teh publish the paper “A Fast Learning Algorithm for Deep Belief Nets”, in which they stack multiple RBMs together in layers and call the result a Deep Belief Network. The greedy layer-by-layer training process makes training on large amounts of data much more efficient.
Andrew Ng’s group at Stanford starts advocating the use of GPUs for training deep neural networks, speeding up training many-fold. This brings practicality to the field of deep learning by making training on huge volumes of data efficient.
Finding enough labeled data has always been a challenge for the deep learning community. In 2009, Fei-Fei Li, a professor at Stanford, launches ImageNet, a database of 14 million labeled images. It would serve as a benchmark for deep learning researchers, who would participate in the ImageNet competition (ILSVRC) every year.
Xavier Glorot, Antoine Bordes, and Yoshua Bengio, in their paper “Deep Sparse Rectifier Neural Networks”, show that the ReLU activation function can avoid the vanishing gradient problem. This means that, apart from GPUs, the deep learning community now has another tool to avoid the long and impractical training times of deep neural networks.
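The reason is simple: for positive inputs the ReLU’s derivative is exactly 1, so, unlike repeated sigmoid derivatives, the gradient is not shrunk at every layer. A minimal comparison, matching the sigmoid sketch earlier (the depth and pre-activation value are illustrative):

```python
import numpy as np

# ReLU vs sigmoid sketch: compare the gradient factor accumulated over
# 20 layers for a positive pre-activation value.

z = 0.5
sigmoid = 1.0 / (1.0 + np.exp(-z))
sigmoid_grad = sigmoid * (1 - sigmoid)     # about 0.235
relu_grad = 1.0 if z > 0 else 0.0          # exactly 1 for positive inputs

print(sigmoid_grad ** 20)                  # roughly 1e-13: vanishes with depth
print(relu_grad ** 20)                     # 1.0: the gradient passes through intact
```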
AlexNet, a GPU-implemented CNN model designed by Alex Krizhevsky, wins ImageNet’s image classification contest with an accuracy of 84%, a huge jump over the roughly 75% accuracy that earlier models had achieved. This win triggers a new deep learning boom globally.
The Generative Adversarial Network, also known as GAN, is created by Ian Goodfellow. GANs open whole new doors for the application of deep learning in fashion, art, and science due to their ability to synthesize realistic-looking data.
DeepMind’s deep reinforcement learning model AlphaGo beats the human champion in the complex game of Go. The game is much more complex than chess, so this feat captures everyone’s imagination and takes the promise of deep learning to a whole new level.
Yoshua Bengio, Geoffrey Hinton, and Yann LeCun win the 2018 Turing Award for their immense contributions to advancements in deep learning and artificial intelligence. This is a defining moment for those who had worked relentlessly on neural networks when most of the machine learning community had moved away from them in the 1970s.
Disclaimer:
There are countless researchers whose results, directly or indirectly, contributed to the emergence and boom of deep learning. This article only attempts to give a brief history of deep learning by highlighting some key moments and events. Efforts have been made to reproduce the chronological events of deep learning history as accurately as possible. If you have any concerns or feedback, please do write to us.