AN INTRODUCTION TO NEURAL NETWORKS AND CONVOLUTIONAL NEURAL NETWORKS

Md. Wazir Ali
4 min read · Jan 16, 2022

TABLE OF CONTENTS

INTRODUCTION

PERCEPTRON & SIGMOID

ACTIVATION FUNCTIONS

MULTI-LAYERED PERCEPTRONS

LOSS FUNCTION AND OPTIMIZERS

FORWARD PASS AND BACKPROPAGATION

INTRODUCTION TO CONVNETS

ADVANTAGES OF CONVNETS OVER MLPS

CONVOLUTION LAYER

POOLING LAYER

FULLY CONNECTED LAYER

PRETRAINED CONVOLUTIONAL NEURAL NETWORKS

CODES OF PRETRAINED CNNS IN KERAS

INTRODUCTION

A Neural Network is a network of connected neurons which receives inputs at one end and produces an output at the other. An artificial neuron is analogous to a biological neuron in the human brain, which is connected to other neurons via axons that carry electrical signals. Here is a picture of what a biological neuron looks like.

A Biological Neuron

The human brain contains billions of such neurons, and the brain itself is a huge network of them. The brain controls the various actions of our body, such as movement of the hands and legs, vision, touch, and smell, by transmitting electrical signals through these axons according to the electrical inputs the neurons receive through their dendrites.

An Artificial Neural Network finds its applications in the fields of Artificial Intelligence and Machine Learning. These networks learn complex mappings from inputs to outputs for problems such as binary classification, multi-class classification, and regression. When a neural network becomes very deep, consisting of many layers of neurons, it is known as a Deep Neural Network, and the corresponding field of study is known as Deep Learning.

Today, Deep Learning provides solutions to problems of image recognition, object detection, image classification, text classification, speech recognition, image captioning, machine translation, and many other tasks. These tasks belong to the fields of Computer Vision and Natural Language Processing, and also to their intersection.

PERCEPTRON & SIGMOID

PERCEPTRON

A perceptron can be compared to a neuron which takes multiple inputs and produces a binary output of 0 or 1. Below is a diagram for the same.

Image of a Perceptron

In Machine Learning, the Perceptron is a single-layer neural network: an algorithm for supervised learning of binary classifiers on linearly separable inputs.

A binary classifier is a function that determines whether a vector of inputs belongs to some specific class or not. The perceptron works on the principle of thresholding: it produces an output of 0 or 1 based on the weighted sum of its inputs.

Each of the lines connecting the inputs x1, x2, x3, …, xn to the perceptron carries a weight, and the weighted sum of these inputs is a single number. In addition to these inputs, a constant input of 1 is also connected to the perceptron unit; its weight is known as the bias term.

The thresholding function outputs 1 if the weighted sum (including the bias) is greater than 0, and 0 otherwise. The weights are randomly initialized and converge to suitable values after repeated iterations over the training inputs.
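This thresholding rule can be sketched in a few lines of Python. The inputs, weights, and bias below are illustrative values, not taken from the article:

```python
import numpy as np

def perceptron_output(x, w, b):
    """Return 1 if the weighted sum of inputs plus bias exceeds 0, else 0."""
    z = np.dot(w, x) + b
    return 1 if z > 0 else 0

# Illustrative example: three inputs with hand-picked weights and a bias term.
x = np.array([1.0, 0.5, -1.0])
w = np.array([0.4, 0.6, 0.2])
b = -0.3

# Weighted sum: 0.4*1.0 + 0.6*0.5 + 0.2*(-1.0) - 0.3 = 0.2 > 0, so output is 1.
print(perceptron_output(x, w, b))
```

Changing the bias to -0.6 would push the weighted sum below zero and flip the output to 0, which shows how the bias shifts the decision threshold.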

SIGMOID

A sigmoid neuron replaces the perceptron's simple thresholding function, which outputs 0 or 1 depending on whether the weighted sum is less than or equal to 0 or not, with a smooth non-linearity applied on top of the weighted sum of the inputs and the bias.

Mathematically, the sigmoid function is represented by:

g(z) = 1 / (1 + e^(−z))

Sigmoid Function

Here, the quantity z corresponds to the weighted sum of the inputs plus the bias term from the perceptron figure above.

Thresholding Function

Step Function

The problem with the above thresholding function is that it is not continuous at z = 0, which prevents it from being differentiable there; everywhere else its derivative is 0, which makes it unsuitable for gradient-based learning.

Sigmoid Function

Sigmoid Function

As you can see from the above figure, the discontinuity that was evident in the step function is gone. The curve is nearly flat for values of z below about −3.5, its slope then increases and the curve becomes steep around z = 0, and it flattens out again as z goes beyond about 3.5. This function is continuous at all points and differentiable everywhere. Let's have a look at the perceptron with a sigmoid activation.

A Perceptron with a sigmoid function

The derivative of the sigmoid function g(z) is given by g(z) * (1 − g(z)).
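This identity is easy to verify numerically. The sketch below compares the closed-form derivative g(z)(1 − g(z)) with a finite-difference estimate; the check point z = 0.5 is an arbitrary choice for illustration:

```python
import math

def sigmoid(z):
    """g(z) = 1 / (1 + e^(-z))"""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative(z):
    """g'(z) = g(z) * (1 - g(z))"""
    g = sigmoid(z)
    return g * (1.0 - g)

# Compare the closed-form derivative with a central finite-difference estimate.
z, h = 0.5, 1e-6
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)
print(abs(sigmoid_derivative(z) - numeric) < 1e-8)  # True
```

Note that at z = 0 the derivative reaches its maximum value of 0.25, matching the steepest part of the curve in the figure above.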
