Convolutional Neural Networks (CNNs) are a class of artificial neural networks that have demonstrated exceptional effectiveness in various computer vision tasks, such as image classification, object detection, and image segmentation.
- Input Layer
- Convolutional Layers
- Activation Functions
- Pooling (Subsampling) Layers
- Fully Connected (Dense) Layers
- Softmax Layer (Output Layer)
- Loss Function
- Backpropagation
- Training
- Regularization and Normalization
-
Input Layer: The input to a CNN is usually a two-dimensional grid of values representing an image. Each value in the grid corresponds to the intensity of a pixel in the image, and in the case of color images, there are typically three channels for the Red, Green, and Blue (RGB) color information.
-
Convolutional Layers: Convolutional layers are the core building blocks of a CNN. These layers consist of multiple filters (also known as kernels) that slide or "convolve" across the input image. Each filter extracts features from the input by performing a dot product between its weights and the values in a small region of the input. Convolutional layers are responsible for capturing various features in the input data, such as edges, textures, and patterns.
-
Activation Functions: After each convolution operation, an activation function like ReLU (Rectified Linear Unit) is applied element-wise. ReLU introduces non-linearity into the network, helping it learn complex relationships in the data.
-
Pooling (Subsampling) Layers: Pooling layers reduce the spatial dimensions (width and height) of the feature maps obtained from the convolutional layers. Common pooling methods include max pooling and average pooling, which downsample the feature maps by selecting the maximum or average value within a small window. Pooling helps reduce the computational load, increase the network's translational invariance, and make the network more robust.