Transfer Learning


What is Transfer Learning?

Transfer learning is an approach for transferring knowledge learned on one machine learning task to another. In practice, a model pre-trained on one task is re-purposed as the starting point for a new task, which can save a great deal of training time and compute resources.

Creating complex models from scratch requires vast amounts of compute, data, and time. Transfer learning accelerates the process by exploiting commonalities between tasks (such as detecting edges in images) and applying what was already learned to the new task. Training time can drop from weeks to hours, making machine learning commercially viable for many more businesses.

Transfer learning is especially popular in domains like computer vision and natural language processing (NLP), where large amounts of data are needed to produce accurate models.

An Example

Let's say we have only 1,000 images of an object we want to classify. If we take a pre-trained CNN such as ResNet-50, which was trained on millions of images, we can fine-tune it on our small dataset and build a state-of-the-art model with minimal effort. In a neural network, this is achieved by removing the final layer of the existing model (the output, or "loss output," layer) and replacing it with a new layer sized for the intended prediction.
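As a rough sketch of what this looks like in code, the snippet below uses PyTorch and torchvision (one possible framework choice; the 10-class output size is an illustrative assumption, not part of the example above) to load a pre-trained ResNet-50, freeze its weights, and swap the final layer for a new classification head.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load ResNet-50 with weights pre-trained on ImageNet.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# Freeze the pre-trained backbone so its weights are not updated.
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer with a new head sized for
# our task (num_classes = 10 is a hypothetical example value).
num_classes = 10
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only the new head's parameters are passed to the optimizer.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```

With the backbone frozen, training updates only the new head; un-freezing some of the later layers afterwards and continuing training at a low learning rate often improves accuracy further.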

Source: Andrew Ng, "Drivers of ML Industrial Success," NIPS 2016.