Taking the Coursera Deep Learning Specialization, Convolutional Neural Networks course. Will post condensed notes every week as part of the review process. All material originates from the free Coursera course, taught by Andrew Ng. See deeplearning.ai for more details.

Foundations of Convolutional Neural Networks

Convolutional Neural Networks

Computer Vision

• image classification
• object detection in images
• neural style transfer

Edge Detection Example

Convolution is when you ‘map’ a kernel or filter matrix over your original matrix. Starting from the top left, multiply the filter elementwise with the overlapping region of the original matrix and sum all of the products to get one output value; then slide the filter over and repeat.
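The sliding-window operation above can be sketched in a few lines of NumPy (a minimal illustration, not the course's code; the vertical-edge filter values are the ones used in the lecture):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' convolution: slide the kernel over the image, elementwise
    multiply, and sum. The output shrinks to (n - f + 1) per side."""
    n, f = image.shape[0], kernel.shape[0]
    out = np.zeros((n - f + 1, n - f + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + f, j:j + f] * kernel)
    return out

# Vertical edge detector: bright-to-dark transitions produce large values.
vertical_edge = np.array([[1, 0, -1],
                          [1, 0, -1],
                          [1, 0, -1]])
# 6x6 image: left half bright (10), right half dark (0).
image = np.hstack([np.full((6, 3), 10.0), np.zeros((6, 3))])
result = conv2d_valid(image, vertical_edge)  # 4x4; edge shows up as 30s
```

The detected edge appears as a band of large values in the middle columns of the output, exactly where the brightness changes.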

More Edge Detection

Can use other types of filters.

You can make the neural network learn the filter through backpropagation by treating the filter's entries as parameters to be learned.

Padding

• padding solves the shrinking output and the underutilization of edge and corner pixels
• pad values are typically 0
• e.g. padding $p = 1$ adds one border of zeros around the input

$$n + 2p - f + 1 \text{ by } n + 2p - f + 1$$

How much to pad? Valid and Same convolutions.

• “Valid”: no padding. $(n \times n) * (f \times f) \rightarrow (n - f + 1) \times (n - f + 1)$
• “Same”: pad so that the output size is the same as the input size.

$$(n + 2p - f + 1) \times (n + 2p - f + 1)$$ Setting the output size equal to $n$ gives $p = \dfrac{f-1}{2}$; $f$ is usually odd.
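A quick sanity check of the same-padding formula in plain Python (my own sketch, not from the course):

```python
n = 6  # input side length
for f in (3, 5, 7):          # odd filter sizes
    p = (f - 1) // 2         # "same" padding
    out = n + 2 * p - f + 1  # output side length
    print(f"f={f}: p={p}, output stays {out}x{out}")
```

For every odd $f$, the output remains $6 \times 6$.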

Strided Convolutions

• striding is the act of skipping over a number of cells during convolution.
• default case is stride of 1, where you move the filter one cell at a time.

• in the case that the stride positions the filter so that it hangs off the edge of the input, the convention is to simply not apply it there (round down): the output size is $\left\lfloor \dfrac{n + 2p - f}{s} \right\rfloor + 1$
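The stride convention can be captured in one helper function (a sketch; the 7×7/stride-2 example matches the lecture):

```python
from math import floor

def conv_output_size(n, f, p=0, s=1):
    """Output side length of an n x n input convolved with an f x f
    filter, padding p, stride s. The floor implements the convention
    that a filter hanging off the edge is simply not applied."""
    return floor((n + 2 * p - f) / s) + 1

print(conv_output_size(7, 3, p=0, s=2))  # 7x7 input, 3x3 filter, stride 2 -> 3
print(conv_output_size(6, 3, p=1, s=1))  # "same" padding keeps 6x6 -> 6
```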

Convolutions Over Volume

• Same operation as a single layer convolution, except both the filter and the input now have multiple channels.

• multiply each cell of the filter by the corresponding cell of the input volume; the output value is the sum of all these products.
• That is how a 6x6x3 * 3x3x3 becomes a 4x4x1.

• To handle multiple filters, you simply stack the results together.
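The 6x6x3 * 3x3x3 → 4x4 example, extended to multiple filters, can be sketched as (illustrative NumPy, not the course's code):

```python
import numpy as np

def conv_volume(volume, filters):
    """Convolve an (n, n, c) volume with a list of (f, f, c) filters.
    Each filter spans all input channels and its products are summed
    into one 2-D slice; the slices for all filters are stacked."""
    n, _, c = volume.shape
    f = filters[0].shape[0]
    out = np.zeros((n - f + 1, n - f + 1, len(filters)))
    for k, filt in enumerate(filters):
        for i in range(n - f + 1):
            for j in range(n - f + 1):
                out[i, j, k] = np.sum(volume[i:i + f, j:j + f, :] * filt)
    return out

volume = np.random.rand(6, 6, 3)
filters = [np.random.rand(3, 3, 3), np.random.rand(3, 3, 3)]
print(conv_volume(volume, filters).shape)  # (4, 4, 2): one slice per filter
```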

One Layer of a Convolutional Network

• the bias for a convolutional layer is a single real number per filter
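One consequence: the number of parameters depends only on the filters, not on the input size. The lecture's example of ten 3x3x3 filters works out as:

```python
def conv_layer_params(f, c_in, n_filters):
    """Each filter has f*f*c_in weights plus one real-valued bias."""
    return (f * f * c_in + 1) * n_filters

# Ten 3x3 filters over 3 channels: (27 + 1) * 10 = 280 parameters,
# regardless of how large the input image is.
print(conv_layer_params(3, 3, 10))  # 280
```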

Pooling Layers

• can also do averages
• for multiple channels, simply apply the same pooling operation to each channel independently
• nothing to learn (no parameters)
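Max pooling on a single channel can be sketched as (a minimal illustration; note there are indeed no learnable parameters, only the hyperparameters $f$ and $s$):

```python
import numpy as np

def max_pool(x, f=2, s=2):
    """Max pooling on a 2-D array: take the max of each f x f window,
    moving with stride s. Nothing to learn."""
    out_n = (x.shape[0] - f) // s + 1
    out = np.zeros((out_n, out_n))
    for i in range(out_n):
        for j in range(out_n):
            out[i, j] = x[i * s:i * s + f, j * s:j * s + f].max()
    return out

x = np.array([[1., 3., 2., 1.],
              [2., 9., 1., 1.],
              [1., 3., 2., 3.],
              [5., 6., 1., 2.]])
print(max_pool(x))  # [[9. 2.] [6. 3.]]
```

For average pooling, replace `.max()` with `.mean()`.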

Why Convolutions?

• Convolutions allow you to reduce the number of parameters to train
• Parameter Sharing
• parameters are shared across the entire input
• Sparsity of Connections
• each output value depends on only a small number of input values
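The parameter savings are dramatic. Using the lecture's comparison of a 32x32x3 input mapped to a 28x28x6 output:

```python
# Fully connected: flatten 32x32x3 = 3072 inputs to 28x28x6 = 4704
# outputs -> about 14 million weights.
fc_params = (32 * 32 * 3) * (28 * 28 * 6)

# Convolutional: six 5x5 filters over 3 channels, one bias each.
conv_params = (5 * 5 * 3 + 1) * 6

print(fc_params, conv_params)  # 14450688 vs 456
```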