Recurrent Neural Network (RNN)


Recurrent Neural Networks (RNNs) are a type of artificial neural network with a chain-like structure that makes them especially well-suited to operating on sequences and lists. RNNs are applied to a wide variety of problems involving text, audio, video, and time series data, including speech recognition, detection of stock trading patterns, analysis of DNA sequences, language modeling, translation, image captioning, and more.

How RNNs Differ from Vanilla Feed-forward Networks

RNNs differ in that they retain information about previously received inputs. They are networks with feedback loops that allow information to persist -- a trait analogous to short-term memory. This sequential memory is kept in the recurrent network's hidden state vector, which represents the context built up from prior inputs and outputs. Unlike in a feed-forward network, the same input may therefore produce different outputs depending on the inputs that preceded it.
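As a rough sketch of this recurrence, the snippet below implements a single vanilla (Elman-style) RNN cell in NumPy. The dimensions, weight names, and initialization are illustrative assumptions, not taken from any particular library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not from the article)
input_size, hidden_size = 4, 8

# Parameters of one vanilla RNN cell
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input  -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden -> hidden (the feedback loop)
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    """One timestep: the new hidden state depends on the current input
    AND the previous hidden state, which is what gives the network memory."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Run the cell over a short toy sequence, carrying the hidden state forward
sequence = rng.normal(size=(5, input_size))   # 5 timesteps of toy input
h = np.zeros(hidden_size)                     # initial, "empty" memory
for x_t in sequence:
    h = rnn_step(x_t, h)                      # h now summarizes everything seen so far
```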

The concept is illustrated in the image below: the amount of color remaining at each step of the sequence shows which pieces of information persist over time.

(Image source: Michael Nguyen / Learned Vector)

Regular feed-forward networks such as CNNs only consider the current input. They have no memory of what happened in the past, so they have trouble predicting what comes next in a sequence.
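This contrast can be made concrete with a small, self-contained sketch (again with illustrative NumPy weights): a memoryless layer maps the same input to the same output every time, while a recurrent cell's output for that input changes with the hidden state carried over from earlier steps.

```python
import numpy as np

rng = np.random.default_rng(1)
input_size, hidden_size = 4, 8

W_xh = rng.normal(scale=0.5, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.5, size=(hidden_size, hidden_size))

def rnn_step(x_t, h_prev):
    # Output depends on the current input AND the history summarized in h_prev
    return np.tanh(W_xh @ x_t + W_hh @ h_prev)

def feedforward(x_t):
    # No recurrent term: output depends only on the current input
    return np.tanh(W_xh @ x_t)

x = rng.normal(size=input_size)            # the same current input in both cases
history_a = rng.normal(size=hidden_size)   # two different "pasts"
history_b = rng.normal(size=hidden_size)

print(np.allclose(feedforward(x), feedforward(x)))                  # True: no memory
print(np.allclose(rnn_step(x, history_a), rnn_step(x, history_b)))  # False: history changes the output
```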

Shortcomings of RNNs

RNNs are inherently poor at retaining information over long periods of time because of the infamous vanishing gradient problem in back-propagation: gradients shrink as they are propagated back through many timesteps, so the influence of early inputs fades. For example, the name of a character introduced at the start of a paragraph of text may be forgotten by the end. LSTMs were invented to solve this problem -- they can discern key information, retain it over long periods of time, and then use it when needed much later in the sequence.
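The vanishing gradient effect can be observed numerically. In back-propagation through time, the gradient with respect to an early hidden state is a product of one Jacobian per timestep; for a tanh cell that Jacobian is diag(1 - h_t**2) @ W_hh. The sketch below (illustrative sizes and initialization, not from the article) tracks the norm of that product and shows it collapsing toward zero over a long sequence.

```python
import numpy as np

rng = np.random.default_rng(2)
hidden_size = 8

# Recurrent weight matrix with a modest scale, as is typical after initialization
W_hh = rng.normal(scale=0.3, size=(hidden_size, hidden_size))

grad = np.eye(hidden_size)   # running product of per-step Jacobians
h = np.zeros(hidden_size)
for t in range(1, 51):
    x_t = rng.normal(size=hidden_size) * 0.1       # toy input, already projected to hidden size
    h = np.tanh(W_hh @ h + x_t)
    jacobian = np.diag(1.0 - h**2) @ W_hh          # d h_t / d h_{t-1} for a tanh cell
    grad = jacobian @ grad
    if t % 10 == 0:
        print(f"timestep {t}: gradient norm = {np.linalg.norm(grad):.2e}")

# The norm shrinks toward zero, so early inputs barely influence late outputs --
# the vanishing gradient problem that LSTMs were designed to mitigate.
```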
