MNIST Digits Classification with numpy only
Example on Digits Classification with the help of MNIST dataset of handwritten digits and Convolutional Neural Network.
Content
A short description of the content is given below. The full code can be found in the course via the link above:
- MNIST Digits Classification with numpy only
- Loading MNIST dataset
- Plotting examples of digits from MNIST dataset
- Preprocessing loaded MNIST dataset
- Saving and Loading serialized models
- Functions for dealing with CNN layers
- Naive Forward Pass for Convolutional layer
- Naive Backward Pass for Convolutional layer
- Naive Forward Pass for Max Pooling layer
- Naive Backward Pass for Max Pooling layer
- Forward Pass for Affine layer
- Backward Pass for Affine layer
- Forward Pass for ReLU layer
- Backward Pass for ReLU layer
- Softmax Classification loss
- Creating Classifier - model of CNN
- Initializing new Network
- Evaluating loss for training ConvNet1
- Calculating scores for predicting ConvNet1
- Functions for Optimization
- Creating Solver Class
- _Reset
- _Step
- Checking Accuracy
- Train
- Overfitting Small Data
- Training Results
- Full Codes
MNIST Digits Classification with numpy only
In this example we’ll test a CNN for digit classification with the help of the MNIST dataset.
The following standard and most common parameters can be used and tested:
Parameter | Options |
---|---|
Weights Initialization | HE Normal |
Weights Update Policy | Vanilla SGD, Momentum SGD, RMSProp, Adam |
Activation Functions | ReLU, Sigmoid |
Regularization | L2, Dropout |
Pooling | Max, Average |
Loss Functions | Softmax, SVM |
Abbreviations:
- Vanilla SGD - Vanilla Stochastic Gradient Descent
- Momentum SGD - Stochastic Gradient Descent with Momentum
- RMSProp - Root Mean Square Propagation
- Adam - Adaptive Moment Estimation
- SVM - Support Vector Machine
For the current example the following architecture will be used:
Input -> Conv -> ReLU -> Pool -> Affine -> ReLU -> Affine -> Softmax
For the current example the following parameters will be used:
Parameter | Value |
---|---|
Weights Initialization | HE Normal |
Weights Update Policy | Vanilla SGD |
Activation Functions | ReLU |
Regularization | L2 |
Pooling | Max |
Loss Functions | Softmax |
Loading MNIST dataset
After downloading the files from the official resource, the following files should be present (a sketch of how they can be read follows the list):
- train-images-idx3-ubyte.gz
- train-labels-idx1-ubyte.gz
- t10k-images-idx3-ubyte.gz
- t10k-labels-idx1-ubyte.gz
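A minimal sketch of reading these IDX files with numpy only; the function names here are this sketch's own, not necessarily those in the course code:

```python
import gzip
import numpy as np

def load_mnist_images(filename):
    # IDX image format: 16-byte header (magic, count, rows, cols), then raw pixels
    with gzip.open(filename, 'rb') as f:
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data[16:].reshape(-1, 28, 28)

def load_mnist_labels(filename):
    # IDX label format: 8-byte header (magic, count), then one byte per label
    with gzip.open(filename, 'rb') as f:
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data[8:]

x_train = load_mnist_images('train-images-idx3-ubyte.gz')
y_train = load_mnist_labels('train-labels-idx1-ubyte.gz')
x_test = load_mnist_images('t10k-images-idx3-ubyte.gz')
y_test = load_mnist_labels('t10k-labels-idx1-ubyte.gz')
```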
Plotting examples of digits from MNIST dataset
After the dataset is loaded, it is possible to show examples of the training images.
The result can be seen in the image below.
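A sketch of such a plot with matplotlib, assuming the `x_train` and `y_train` arrays from the loader above:

```python
import matplotlib.pyplot as plt

# Show a 5x5 grid of training digits with their labels as titles
fig, axes = plt.subplots(5, 5, figsize=(6, 6))
for i, ax in enumerate(axes.flat):
    ax.imshow(x_train[i], cmap='gray')
    ax.set_title(str(y_train[i]))
    ax.axis('off')
plt.tight_layout()
plt.show()
```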
Preprocessing loaded MNIST dataset
Next, creating a function for preprocessing the MNIST dataset for further use in the classifier:
- Normalizing data by dividing by 255.0 (optional - up to the researcher)
- Normalizing data by subtracting the mean image and dividing by the standard deviation (optional - up to the researcher)
- Transposing every dataset to make channels come first
- Returning the result as a dictionary
As a result there will be the following shapes (a sketch of such a function follows):
x_train: (59000, 1, 28, 28)
y_train: (59000,)
x_validation: (1000, 1, 28, 28)
y_validation: (1000,)
x_test: (1000, 1, 28, 28)
y_test: (1000,)
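A minimal sketch of such a preprocessing function, assuming the raw arrays from the loader above; here only the division by 255.0 is applied, and the split sizes match the shapes shown:

```python
import numpy as np

def preprocess_mnist(x_train_raw, y_train_raw, x_test_raw, y_test_raw,
                     n_validation=1000, n_test=1000):
    # Normalize to [0, 1] by dividing by 255.0
    x_train = x_train_raw.astype(np.float32) / 255.0
    x_test = x_test_raw[:n_test].astype(np.float32) / 255.0

    # Carve the validation split off the end of the training set
    x_validation, y_validation = x_train[-n_validation:], y_train_raw[-n_validation:]
    x_train, y_train = x_train[:-n_validation], y_train_raw[:-n_validation]

    # Make channels come first: (N, 28, 28) -> (N, 1, 28, 28)
    to_nchw = lambda a: a.reshape(a.shape[0], 1, 28, 28)
    return {'x_train': to_nchw(x_train), 'y_train': y_train,
            'x_validation': to_nchw(x_validation), 'y_validation': y_validation,
            'x_test': to_nchw(x_test), 'y_test': y_test_raw[:n_test]}

data = preprocess_mnist(x_train, y_train, x_test, y_test)
```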
Saving and Loading serialized models
Saving the loaded and preprocessed data into a ‘pickle’ file.
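A minimal sketch, assuming the `data` dictionary returned by the preprocessing function above:

```python
import pickle

# Serialize the preprocessed dataset to disk
with open('data.pickle', 'wb') as f:
    pickle.dump(data, f)

# Later, load it back without repeating the preprocessing
with open('data.pickle', 'rb') as f:
    data = pickle.load(f)
```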
Functions for dealing with CNN layers
Creating functions for CNN layers (sketches of a few of these follow the list):
- Naive Forward Pass for Convolutional layer
- Naive Backward Pass for Convolutional layer
- Naive Forward Pass for Max Pooling layer
- Naive Backward Pass for Max Pooling layer
- Forward Pass for Affine layer
- Backward Pass for Affine layer
- Forward Pass for ReLU layer
- Backward Pass for ReLU layer
- Softmax Classification loss
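To illustrate the style of these functions, below is a minimal sketch of the naive Convolutional forward pass, the ReLU pair, and the Softmax loss; the signatures and the (stride, pad) parameterization are this sketch's assumptions, not necessarily the exact course code:

```python
import numpy as np

def conv_forward_naive(x, w, b, stride=1, pad=1):
    # x: (N, C, H, W) input, w: (F, C, HH, WW) filters, b: (F,) biases
    N, C, H, W = x.shape
    F, _, HH, WW = w.shape
    x_pad = np.pad(x, ((0, 0), (0, 0), (pad, pad), (pad, pad)), mode='constant')
    H_out = 1 + (H + 2 * pad - HH) // stride
    W_out = 1 + (W + 2 * pad - WW) // stride
    out = np.zeros((N, F, H_out, W_out))
    # Naive: explicit loops over images, filters and output positions
    for n in range(N):
        for f in range(F):
            for i in range(H_out):
                for j in range(W_out):
                    window = x_pad[n, :, i * stride:i * stride + HH,
                                         j * stride:j * stride + WW]
                    out[n, f, i, j] = np.sum(window * w[f]) + b[f]
    return out

def relu_forward(x):
    # Element-wise max(0, x); cache the input for the backward pass
    return np.maximum(0, x), x

def relu_backward(dout, cache):
    # Gradient flows only through positions that were positive
    return dout * (cache > 0)

def softmax_loss(scores, y):
    # Numerically stable softmax followed by cross-entropy loss
    N = scores.shape[0]
    shifted = scores - scores.max(axis=1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(N), y]).mean()
    dscores = probs.copy()
    dscores[np.arange(N), y] -= 1
    dscores /= N
    return loss, dscores
```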
Creating Classifier - model of CNN
Creating the model of the CNN Classifier (a skeleton sketch follows the list):
- Creating class for ConvNet1
- Initializing new Network
- Evaluating loss for training ConvNet1
- Calculating scores for predicting ConvNet1
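A skeleton of such a class with HE Normal weight initialization might look as follows; the layer sizes, parameter names and default hyperparameters are illustrative assumptions:

```python
import numpy as np

class ConvNet1:
    # Architecture: Input -> Conv -> ReLU -> Pool -> Affine -> ReLU -> Affine -> Softmax
    def __init__(self, input_dim=(1, 28, 28), num_filters=32, filter_size=7,
                 hidden_dim=100, num_classes=10, reg=0.0):
        C, H, W = input_dim
        self.reg = reg
        self.params = {}
        # HE Normal initialization: std = sqrt(2 / fan_in)
        fan_in_conv = C * filter_size * filter_size
        self.params['w1'] = (np.random.randn(num_filters, C, filter_size, filter_size)
                             * np.sqrt(2.0 / fan_in_conv))
        self.params['b1'] = np.zeros(num_filters)
        # Assuming 'same' conv padding, 2x2 max pooling halves the spatial size
        fan_in_affine = num_filters * (H // 2) * (W // 2)
        self.params['w2'] = (np.random.randn(fan_in_affine, hidden_dim)
                             * np.sqrt(2.0 / fan_in_affine))
        self.params['b2'] = np.zeros(hidden_dim)
        self.params['w3'] = (np.random.randn(hidden_dim, num_classes)
                             * np.sqrt(2.0 / hidden_dim))
        self.params['b3'] = np.zeros(num_classes)
```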
Defining Functions for Optimization
Using different types of optimization rules to update the parameters of the Model.
Vanilla SGD updating method
The rule for updating parameters is as follows: w = w - learning_rate * dw
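A minimal sketch of this update rule; the `config` dictionary pattern is an assumption of the sketch:

```python
def sgd(w, dw, config=None):
    # Vanilla SGD: step against the gradient, scaled by the learning rate
    if config is None:
        config = {}
    config.setdefault('learning_rate', 1e-2)
    w -= config['learning_rate'] * dw
    return w, config
```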
Creating Solver Class
Creating the Solver class for training classification models and for predicting (a usage sketch follows the list):
- Creating and Initializing class for Solver
- Creating ‘reset’ function for defining variables for optimization
- Creating function ‘step’ for making single gradient update
- Creating function for checking accuracy of the model on the current provided data
- Creating function for training the model
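A hypothetical usage sketch of such a Solver; the constructor arguments are assumptions modeled on the list above, not the exact interface:

```python
model = ConvNet1(hidden_dim=500, reg=1e-3)
solver = Solver(model, data,
                update_rule='sgd',
                optim_config={'learning_rate': 1e-3},
                num_epochs=10,
                batch_size=50,
                print_every=100)
solver.train()
```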
Overfitting Small Data
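A standard sanity check before full training is to try to overfit a small subset of the data: if the model can reach near-100% training accuracy on, say, 100 examples, the layers and the update rule are most likely wired correctly. A sketch, assuming the `data` dictionary and the hypothetical `Solver` interface above:

```python
num_train = 100
small_data = {
    'x_train': data['x_train'][:num_train],
    'y_train': data['y_train'][:num_train],
    'x_validation': data['x_validation'],
    'y_validation': data['y_validation'],
}

model = ConvNet1(reg=0.0)
solver = Solver(model, small_data,
                update_rule='sgd',
                optim_config={'learning_rate': 1e-3},
                num_epochs=20,
                batch_size=50)
solver.train()  # training accuracy should approach 1.0 on this tiny subset
```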
Training Results
The training process of Model #1 over 12,000 iterations is shown in the figure below:
The initialized and trained filters of the Conv layer are shown in the figure below:
The training process for the filters of the Conv layer is shown in the figure below:
MIT License
Copyright (c) 2018 Valentyn N Sichkar
github.com/sichkar-valentyn
Reference to:
Valentyn N Sichkar. Neural Networks for computer vision in autonomous vehicles and robotics // GitHub platform. DOI: 10.5281/zenodo.1317904