Contents
- 1 Introduction
- 2 MNIST Handwritten Digit Dataset
- 3 Setting up Google Colab
- 4 Image Classification using Deep Neural Network with Keras
- 4.1 Importing required libraries
- 4.2 Read the CSV files using Pandas
- 4.3 Reading Image from MNIST Dataset
- 4.4 Data Preprocessing
- 4.5 Split Training set into Train and Validation set
- 4.6 Deep Neural Network Model Architecture
- 4.7 Implementation of Deep Neural Network with Keras
- 4.8 Early Stopping
- 4.9 Training of Model
- 4.10 Test the Model
- 5 Conclusion
Introduction
In this article, we will learn image classification with Keras using deep learning. We will not use a convolutional neural network but just a simple deep neural network, which will still show very good accuracy. For this purpose, we will use the MNIST handwritten digits dataset, often considered the "Hello World" of deep learning tutorials. And since deep learning models train faster on GPUs, we will use Google Colab for building our model.
- Read More – Image Classification using Bag of Visual Words Model
- Read More – Keras Implementation of VGG16 Architecture from Scratch
Before we do the actual hands-on, let us first understand the MNIST dataset.
MNIST Handwritten Digit Dataset

- The MNIST handwritten digit classification problem is a standard benchmark in computer vision and deep learning. The dataset consists of 60,000 small square 28×28 pixel grayscale images of handwritten single digits between 0 and 9.
- The task is to classify a given image of a handwritten digit into one of 10 classes representing integer values from 0 to 9.
- I have downloaded the dataset from Kaggle to show how you can use your own dataset to train your model.
- On Kaggle, the dataset comes as two files, train.csv and test.csv, which contain gray-scale images of hand-drawn digits from zero through nine.
- Each image is 28 pixels in height and 28 pixels in width, for a total of 784 pixels. Each pixel has a single pixel-value associated with it, indicating the lightness or darkness of that pixel, with higher numbers meaning darker. This pixel-value is an integer between 0 and 255, inclusive (see the short sketch after this list for how a pixel's position maps to a CSV column).
- The training data set (train.csv) has 785 columns. The first column, called “label”, is the digit that was drawn by the user. The rest of the columns contain the pixel-values of the associated image.
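To make the row layout concrete, here is a tiny sketch of how a pixel's (row, column) position in the 28×28 image maps to its column index among the 784 pixel columns (pixel_index is our own illustrative helper, not part of the dataset):
# Pixels are stored row by row, so the pixel at (row, col),
# 0-indexed, lives at column index row * 28 + col
def pixel_index(row, col):
    return row * 28 + col

print(pixel_index(0, 0))    # 0   -> top-left corner
print(pixel_index(13, 14))  # 378 -> near the center
print(pixel_index(27, 27))  # 783 -> bottom-right corner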
Setting up Google Colab

We have uploaded the dataset to our Google Drive, so we need to mount the Google Drive directory on our Colab runtime environment as shown below. This command generates a URL; click it, authenticate with your Google account, copy the authorization key back here, and press Enter.
from google.colab import drive
drive.mount('/content/gdrive')
Now that we have set up Google Colab, let us start with the actual building of the image classification model with Keras.
Image Classification using Deep Neural Network with Keras
Importing required libraries
import cv2
import numpy as np
import os
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
Read the CSV files using Pandas
train_path="/content/gdrive/My Drive/train.csv"
test_path="/content/gdrive/My Drive/test.csv"
train = pd.read_csv(train_path)
print(train.shape)
train.head()
The train dataframe has 42,000 rows and 785 columns: the label column plus the 784 pixel columns. Similarly, the test dataframe that we read next has 28,000 rows and 784 columns. There is no label column here, since the labels are what our image classification model has to predict.
test = pd.read_csv(test_path)
print(test.shape)
test.head()
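Before preprocessing, it helps to see what one of these rows actually looks like as an image. Below is a minimal sketch (the row index 0 is an arbitrary choice) that drops the label, reshapes the 784 pixel values of a training row into a 28×28 array, and displays it with matplotlib:
# Take the first training row, drop the label column, and reshape to 28x28
sample = train.iloc[0, 1:].values.astype('float32').reshape(28, 28)
plt.imshow(sample, cmap='gray')
plt.title('Label: {}'.format(train.iloc[0, 0]))
plt.show()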
Data Preprocessing
# Separate the 784 pixel columns (features) from the label column
X_train = (train.iloc[:,1:].values).astype('float32')
y_train = train.iloc[:,0].values.astype('int32')
X_test = test.values.astype('float32')
# Inspect the resulting arrays
X_train
X_train.shape
y_train
As you can see above, y_train contains integer labels, and we will convert them into one-hot encoding. In one-hot encoding, the integers are represented as vectors as follows –
0 – [1,0,0,0,0,0,0,0,0,0]
1 – [0,1,0,0,0,0,0,0,0,0]
2 – [0,0,1,0,0,0,0,0,0,0]
3 – [0,0,0,1,0,0,0,0,0,0] .. and so on.
We will use the Keras built-in function to_categorical() to perform one-hot encoding.
# note: in older Keras versions this import lived at keras.utils.np_utils
from keras.utils import to_categorical
y_train = to_categorical(y_train)
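After the conversion, y_train changes from a vector of 42,000 integers to a matrix with one one-hot row of 10 columns per training example, which we can quickly check:
print(y_train.shape)  # (42000, 10)
print(y_train[0])     # one-hot vector for the first training label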
Now the final step in our preprocessing is data normalization, where we rescale the grayscale values from the range 0–255 to the range 0–1. This helps the model converge faster during training.
X_train=X_train/255.0
X_test=X_test/255.0
Split Training set into Train and Validation set
As a good practice, it is better to split the training set into train and validation sets. The validation set is used during the training of the neural network to monitor how well the model generalizes to data it has not been trained on.
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.10, random_state=42)
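With test_size=0.10, the 42,000 training rows are split into 37,800 training and 4,200 validation examples, which we can verify:
print(X_train.shape, y_train.shape)  # (37800, 784) (37800, 10)
print(X_val.shape, y_val.shape)      # (4200, 784) (4200, 10)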
Deep Neural Network Model Architecture

- Here the input layer has 784 neurons, one for each pixel of the image.
- There are two hidden layers with 100 and 50 neurons respectively, and both use the ReLU activation function.
- The output layer has 10 neurons corresponding to the digits 0–9 and has a softmax activation function.
- There are no fixed rules for how many hidden layers and neurons a neural network should have; this comes from trial and experimentation. The architecture that we have chosen has given us good results for image classification with the MNIST handwritten digits dataset (see the parameter-count sketch after this list).
- Also Read – Animated Explanation of Feed Forward Neural Network Architecture
- Also Read – Animated guide to Activation Functions in Neural Network
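As a quick sanity check on this architecture, we can count its trainable parameters by hand. A Dense layer with n inputs and m neurons has n × m weights plus m biases:
# first hidden layer: 784 inputs -> 100 neurons
first = 784 * 100 + 100   # 78,500
# second hidden layer: 100 inputs -> 50 neurons
second = 100 * 50 + 50    # 5,050
# output layer: 50 inputs -> 10 neurons
output = 50 * 10 + 10     # 510
print(first + second + output)  # 84,060 trainable parameters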
Implementation of Deep Neural Network with Keras
To create the neural network model we have to import the following modules from the Keras library.
from keras import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.optimizers import RMSprop
import tensorflow as tf

# Check that the Colab GPU is visible (the old private call
# K.tensorflow_backend._get_available_gpus() breaks with TensorFlow 2.x)
tf.config.list_physical_devices('GPU')
classifier = Sequential()
# First hidden layer
classifier.add(Dense(100, activation='relu', kernel_initializer='random_normal', input_shape=(784,), name='first_layer'))
# Second hidden layer
classifier.add(Dense(50, activation='relu', kernel_initializer='random_normal', name='second_layer'))
# Output layer
classifier.add(Dense(10, activation='softmax'))
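We can confirm the layer shapes and the 84,060 trainable parameters computed earlier with Keras' built-in summary:
classifier.summary()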
Before making the network ready for training, we have to compile it by specifying the following:
- Loss function: It measures how much the predictions of the neural network deviate from the actual outputs. We use categorical cross-entropy as the loss function.
- Optimizer: It updates the network parameters to reduce the loss in every training iteration. We use the RMSprop optimizer with a learning rate of 0.001.
- Metrics: It monitors the performance of the network; we choose accuracy as the metric.
classifier.compile(optimizer=RMSprop(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
Early Stopping
A major challenge in training neural networks is how long to train them.
Too little training means the model will underfit both the training and the test sets. Too much training means the model will overfit the training dataset and perform poorly on the test set.
To deal with this problem, we use early stopping, and we save the best weights of the model during the training phase with a model checkpoint.
# Save the model whenever validation accuracy improves
mc = ModelCheckpoint('best_model.h5', monitor='val_accuracy', mode='max', save_best_only=True)
# Stop training if validation accuracy does not improve for 20 consecutive epochs
es = EarlyStopping(monitor='val_accuracy', mode='max', verbose=1, patience=20)
Training of Model
For training, we use the fit method of Keras. Here we pass X_train and y_train, supply (X_val, y_val) as the validation data, and pass the model checkpoint and early stopping callbacks through the callbacks argument. We will run it for 50 epochs.
classifier.fit(X_train,y_train,validation_data=(X_val,y_val),epochs=50,shuffle=True,callbacks=[mc,es])
# Evaluate the trained model on the validation set
classifier.evaluate(X_val, y_val)
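Test the Model
Since ModelCheckpoint saved the best weights to best_model.h5, a reasonable final step (a minimal sketch, assuming training has finished and the checkpoint file exists) is to reload that model, evaluate it on the validation set, and generate predictions for the 28,000 unlabeled test images:
from keras.models import load_model

# Reload the weights that achieved the best validation accuracy
best_model = load_model('best_model.h5')
best_model.evaluate(X_val, y_val)

# Predict digit classes for the unlabeled test images
predictions = best_model.predict(X_test)
predicted_labels = np.argmax(predictions, axis=1)
print(predicted_labels[:10])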