Figure 6
Three examples of CNN representations at the first five convolutional layers of AlexNet. The network detects edges from pixels in the first layer, then uses those edges to detect shapes in the next layer, and then uses that result to infer complex shapes and objects in later layers. Thus, despite growing fuzziness, convolutional layers continue to maintain photographically accurate representations of input images. (Best viewed in colour and enlarged.) |