Figure 6
The main building blocks of RNNs, LSTMs, and GRUs. (a), (b) The main building block of an RNN. The information from the previous timestep is added to the current input and passed through the activation function. The output is then used for the prediction and passed to the next timestep. (c) The LSTM uses three gates (for details, see Appendix C4) to control the flow of information through the cell state and the hidden state. The forget gate controls how much information from the previous timestep should be forgotten. The input gate controls how much of the new input should be added. The output gate controls how much of the current cell state is passed on as the hidden state to the next timestep. (d) The GRU uses a gating mechanism similar to that of the LSTM, but with only two gates.
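The single-timestep updates described in the caption can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the formulation used in the paper: the function names, the stacked weight layout (`W`, `U`, `b`), and the GRU blending convention are all assumptions made for this example, and the notation in Appendix C4 may differ.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rnn_step(x, h_prev, W, U, b):
    # Previous hidden state is added to the (transformed) current input,
    # then passed through the activation function.
    return np.tanh(W @ x + U @ h_prev + b)

def lstm_step(x, h_prev, c_prev, W, U, b):
    # Assumed layout: W is (4H, D), U is (4H, H), b is (4H,),
    # stacked in the order forget, input, output, candidate.
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    f = sigmoid(z[0:H])          # forget gate: how much of c_prev to keep
    i = sigmoid(z[H:2 * H])      # input gate: how much new content to add
    o = sigmoid(z[2 * H:3 * H])  # output gate: how much of the cell to expose
    g = np.tanh(z[3 * H:4 * H])  # candidate cell content
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c

def gru_step(x, h_prev, W, U, b):
    # Assumed layout: W is (3H, D), U is (3H, H), b is (3H,),
    # stacked in the order update, reset, candidate.
    H = h_prev.shape[0]
    z = sigmoid(W[0:H] @ x + U[0:H] @ h_prev + b[0:H])              # update gate
    r = sigmoid(W[H:2 * H] @ x + U[H:2 * H] @ h_prev + b[H:2 * H])  # reset gate
    n = np.tanh(W[2 * H:] @ x + U[2 * H:] @ (r * h_prev) + b[2 * H:])
    # One common convention; some texts swap the roles of z and (1 - z).
    return (1.0 - z) * h_prev + z * n

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D, H = 3, 4  # hypothetical input and hidden sizes
    x, h0, c0 = rng.normal(size=D), np.zeros(H), np.zeros(H)
    h_rnn = rnn_step(x, h0, rng.normal(size=(H, D)), rng.normal(size=(H, H)), np.zeros(H))
    h_lstm, c_lstm = lstm_step(x, h0, c0, rng.normal(size=(4 * H, D)),
                               rng.normal(size=(4 * H, H)), np.zeros(4 * H))
    h_gru = gru_step(x, h0, rng.normal(size=(3 * H, D)),
                     rng.normal(size=(3 * H, H)), np.zeros(3 * H))
    print(h_rnn.shape, h_lstm.shape, c_lstm.shape, h_gru.shape)
```

In this sketch the LSTM carries two quantities between timesteps (the hidden state and the cell state), whereas the RNN and the GRU carry only the hidden state, which is one way to see why the GRU needs fewer gates.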