Figure 9
The classical (a) and residual (b), (c) mappings of the input. (a) In a classical CNN, the network directly learns the input-to-output mapping function and passes the result to the activation function. (b), (c) In ResNets, the network primarily learns the residual, i.e., the difference between the input and the output, rather than the complete output mapping; the input is carried across the block either by an identity shortcut or by a 1 × 1 convolution that adjusts the dimensions when input and output shapes differ. This approach enables each layer to capture additional nuances without losing the information learned by previous layers.
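To make the residual mapping in panels (b) and (c) concrete, the following is a minimal sketch of a residual block, assuming PyTorch; the class name ResidualBlock and the specific layer configuration (two 3 × 3 convolutions with batch normalization) are illustrative choices, not taken from the figure.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Residual block: output = ReLU(F(x) + shortcut(x))."""

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        # F(x): the residual function learned by the block
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                               stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

        # Shortcut: identity when shapes already match (panel b);
        # otherwise a 1 x 1 convolution that adjusts the dimensions (panel c)
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )
        else:
            self.shortcut = nn.Identity()

    def forward(self, x):
        residual = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        # Add the (possibly projected) input back before the activation,
        # so the block only has to learn the difference from the input
        return self.relu(residual + self.shortcut(x))

# Example: identity shortcut (b) vs. 1 x 1 projection shortcut (c)
x = torch.randn(1, 64, 32, 32)
print(ResidualBlock(64, 64)(x).shape)            # identity shortcut
print(ResidualBlock(64, 128, stride=2)(x).shape) # 1 x 1 projection shortcut
```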