Let us label the layers: Input -> Conv1 -> Conv2 -> MaxPool -> FC1 -> FC2 -> Softmax
1. Conv1:
Kernel size = 8x8x1 (1 because the depth of the input (84x84x1) is 1)
No. of filters = 16
Stride = 4x4
=> no. of weights = (kernel size * no. of filters) = 8*8*1*16 = 1024
The no. of biases is always equal to the no. of filters.
=> no. of biases = 16
########################
Now, the output size of a conv layer is given by (we take only the integer part):
O = ((I - F + 2*P) / S) + 1
where
I => length/width of the input
F => filter/kernel size
P => padding ("valid" padding implies no padding, i.e. P = 0)
S => stride
#########################
So for conv1, output length/width = ((84 - 8 + 2*0) / 4) + 1 = 20
=> output shape = 20x20x16 (third dimension is number of filters)
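As a quick sanity check, the formula above can be put into a small Python helper (the names conv_output_size and conv_params are my own, not part of the question):

import math

def conv_output_size(i, f, s, p=0):
    # O = floor((I - F + 2*P) / S) + 1
    return math.floor((i - f + 2 * p) / s) + 1

def conv_params(f, in_depth, n_filters):
    # weights = F * F * depth * filters; one bias per filter
    return f * f * in_depth * n_filters, n_filters

print(conv_output_size(84, 8, 4))  # 20
print(conv_params(8, 1, 16))       # (1024, 16)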
##############################################################################
Conv2:
For conv2, input is output of conv1 => input dimensions = 20x20x16 => I=20
Kernel size = 4x4x16 (16 because the depth of the input (Conv1 output, 20x20x16) is 16)
No. of filters = 32
Stride = 2x2
=> no. of weights = (kernel size * no. of filters) = 4*4*16*32 = 8192
The no. of biases is always equal to the no. of filters.
=> no. of biases = 32
output length/width = ((20 - 4 + 2*0) / 2) + 1 = 9
=> output shape = 9x9x32 (third dimension is number of filters)
##############################################################################
MaxPool:
The output of Conv2 is the input for the MaxPool layer => I = 9
A max-pooling layer has no parameters
=> no. of weights = no. of biases = 0
Kernel size = 2x2
Stride = 2x2
=> output length/width = ((9 - 2) / 2) + 1 = floor(4.5) = 4 (we take only the integer part)
=> output shape = 4x4x32 (the depth remains the same as the input for max pooling)
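The same hypothetical conv_output_size helper from above also verifies Conv2 and the pooling output, since max pooling follows the identical output-size formula:

print(conv_output_size(20, 4, 2))  # 9 (Conv2)
print(conv_output_size(9, 2, 2))   # 4 (MaxPool)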
##############################################################################
Now output of maxpool is input for FC1 => input shape = 4x4x32
In an FC layer, every neuron of the input is connected to every neuron of the output layer.
FC1:
=> all 4*4*32 = 512 input neurons are connected to all 128 neurons of FC1, with one weight per connection.
=> no.of weights = 4*4*32*128 = 65536
The no. of biases in an FC layer is equal to the no. of neurons in that layer, as each neuron has 1 bias.
=> no. of biases = 128
Output shape of FC1 = 128x1 or simply a vector of length 128 [128,]
###############################################################################
FC2:
input dimensions = dimensions of output of FC1 i.e. [128]
no. of weights = 128*4 = 512
no. of biases = 4
output shape = vector of length 4 [4,]
###############################################################################
Softmax:
Softmax doesn't have any weights or biases.
It simply takes the exponential of each input and divides it by the sum of all the exponentials.
So output shape is same as input shape
=> no. of weights = no. of biases = 0
=> output shape = vector of length 4 [4,]
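A minimal NumPy sketch of that computation (my own illustration; the max is subtracted only for numerical stability and does not change the result):

import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # stabilized exponentials
    return e / e.sum()         # normalize so the outputs sum to 1

x = np.array([2.0, 1.0, 0.1, -1.0])  # a hypothetical length-4 input
print(softmax(x))                    # shape (4,), same as the input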
###############################################################################
Now total number of weights in the network = 1024 + 8192 + 0 + 65536 + 512 + 0 = 75264
Total no. of biases = 16 + 32 + 0 + 128 + 4 + 0 = 180
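To double-check these totals, the whole network can be sketched in PyTorch (assuming PyTorch is available; activations between layers are omitted since they carry no parameters and only the counts matter here):

import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=8, stride=4),   # Conv1: 84x84x1 -> 20x20x16
    nn.Conv2d(16, 32, kernel_size=4, stride=2),  # Conv2: 20x20x16 -> 9x9x32
    nn.MaxPool2d(kernel_size=2, stride=2),       # MaxPool: 9x9x32 -> 4x4x32
    nn.Flatten(),                                # 4*4*32 = 512
    nn.Linear(512, 128),                         # FC1
    nn.Linear(128, 4),                           # FC2
    nn.Softmax(dim=1),                           # no parameters
)

weights = sum(p.numel() for name, p in net.named_parameters() if "weight" in name)
biases = sum(p.numel() for name, p in net.named_parameters() if "bias" in name)
print(weights, biases)  # 75264 180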
################################################################################
Part (c):
For calculating the receptive field, we need to calculate 3 things for each layer:
1. the output size of each layer (calculated before)
2. the jump of each layer:
J_out = J_in * S
where J_in => jump of the input/previous layer, J_out => jump of the current layer (output), S => stride of the current layer
3. the receptive field of each layer:
R_out = R_in + (F - 1) * J_in
where R_in => receptive field of the input/previous layer, R_out => receptive field of the current layer (output), F => kernel/filter size
##############################################
For input:
R = 1, J = 1, I = Image size = 84
Now for the question:
I. Conv1:
output length/width = 20 (calculated before)
J(Conv1) = 1 * 4 = 4
R(Conv1) = R(image) + (8 - 1)*J(image) = 1 + 7*1 = 8
II. Conv2:
output length/width = 9 (calculated before)
J(Conv2) = 4 * 2 = 8
R(Conv2) = R(Conv1) + (4 - 1) * J(Conv1) = 8 + 3*4 = 20
III. MaxPool:
output length/width = 4 (calculated before)
J(MaxPool) = 8 * 2 = 16
R(MaxPool) = R(Conv2) + (2 - 1) * J(Conv2) = 20 + 1*8 = 28
Therefore size of receptive field after maxpool = 28 x 28
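The same recurrence can be run as a short Python loop over (name, F, S) tuples taken from the layer parameters above:

layers = [("Conv1", 8, 4), ("Conv2", 4, 2), ("MaxPool", 2, 2)]

j, r = 1, 1  # jump and receptive field of the input image
for name, f, s in layers:
    r = r + (f - 1) * j  # R_out = R_in + (F - 1) * J_in
    j = j * s            # J_out = J_in * S
    print(name, "R =", r, "J =", j)
# prints: Conv1 R = 8 J = 4, Conv2 R = 20 J = 8, MaxPool R = 28 J = 16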
################################################################################