Recurrent Neural Network (RNN)

Recurrent Neural Networks (RNNs) are a type of artificial neural network whose chain-like structure makes them especially well suited to operating on sequences and lists. RNNs are applied to a wide variety of problems involving text, audio, video, and time series data, including speech recognition, detection of stock trading patterns, analysis of DNA sequences, language modeling, translation, image captioning, and more.

How RNNs Differ from Vanilla Feed-forward Networks

Regular feed-forward networks, such as convolutional neural networks (CNNs), consider only the current input. They have no memory of earlier inputs and therefore have trouble predicting what comes next in a sequence.

RNNs differ in that they retain information about previously received inputs. They are networks with feedback loops that allow information to persist -- a trait that is analogous to short-term memory. This sequential memory is preserved in the recurrent network’s hidden state vector, which represents the context built up from the prior inputs and outputs. Unlike in a feed-forward network, the same input may produce different outputs depending on the inputs that preceded it.
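The role of the hidden state can be sketched in a few lines of Python. This is a minimal illustration, not a trained model: the weights `W_XH`, `W_HH`, and `W_HY`, the hidden size of 2, and the helper names `rnn_step`, `run_rnn`, and `feed_forward` are all assumptions made up for this example.

```python
import math

# Hypothetical fixed weights for a tiny RNN with a 2-dimensional hidden state
# and scalar inputs. The values are illustrative, not trained.
W_XH = [0.5, -0.3]                  # input -> hidden
W_HH = [[0.8, 0.1], [-0.2, 0.6]]    # hidden -> hidden (the feedback loop)
W_HY = [1.0, -1.0]                  # hidden -> output


def rnn_step(x, h):
    """Return (output, new_hidden) for one time step."""
    new_h = [
        math.tanh(W_XH[i] * x + sum(W_HH[i][j] * h[j] for j in range(2)))
        for i in range(2)
    ]
    y = sum(W_HY[i] * new_h[i] for i in range(2))
    return y, new_h


def run_rnn(inputs):
    """Feed a sequence through the cell and return the output at each step."""
    h = [0.0, 0.0]  # empty context before the first input
    outputs = []
    for x in inputs:
        y, h = rnn_step(x, h)  # the hidden state carries over to the next step
        outputs.append(y)
    return outputs


def feed_forward(x):
    """A stateless layer with the same input/output weights, for contrast."""
    hidden = [math.tanh(w * x) for w in W_XH]
    return sum(W_HY[i] * hidden[i] for i in range(2))


# The final input is 1.0 in both sequences, but the history differs.
out_a = run_rnn([0.0, 0.0, 1.0])[-1]
out_b = run_rnn([2.0, -1.0, 1.0])[-1]
print(out_a, out_b, feed_forward(1.0))
```

Here `out_a` and `out_b` differ even though the last input is identical, because the hidden state summarizes different histories. The stateless `feed_forward` function returns the same value for `1.0` no matter what came before.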

Shortcomings of RNNs

RNNs struggle to retain information over long sequences because of the vanishing gradient problem in back-propagation: the error signal shrinks as it is propagated back through many time steps, so early inputs have little influence on learning. For example, the name of a character introduced at the start of a paragraph of text may be forgotten towards the end. Long Short-Term Memory networks (LSTMs) were invented to address this problem -- they can discern key information, retain it over long periods of time, and then use it when necessary much later in the sequence.
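A rough numerical sketch of the vanishing gradient is shown below. It assumes a scalar RNN of the form h_t = tanh(w * h_{t-1} + x_t). The recurrent weight `W = 0.9`, the constant input `0.5`, and the function name `gradient_wrt_initial_state` are hypothetical choices for illustration. By the chain rule, the sensitivity of the final state to the initial state is a product of one factor per time step, and that product shrinks quickly when each factor is smaller than 1 in magnitude.

```python
import math


def gradient_wrt_initial_state(w, inputs, h0=0.0):
    """Return d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1} + x_t)."""
    h = h0
    grad = 1.0
    for x in inputs:
        h = math.tanh(w * h + x)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t^2)
        grad *= w * (1.0 - h * h)
    return grad


W = 0.9  # hypothetical recurrent weight, chosen so each factor is below 1
short_grad = gradient_wrt_initial_state(W, [0.5] * 5)
long_grad = gradient_wrt_initial_state(W, [0.5] * 50)
print(short_grad, long_grad)
```

With these assumed values, the gradient over 50 steps is many orders of magnitude smaller than the gradient over 5 steps, which is why information from early in a long sequence contributes so little to training.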

The concept is illustrated in the image below. The amount of color present in each step of the sequence reveals the pieces of information that persist over time.
