Synthetic Data

"Data is the new oil," as The Economist famously put it. The aphorism is a bit cliché, but it is true that the tech giants have benefited disproportionately from AI, due in no small part to the amount of data they collect.

Companies that are not Google, Facebook, Amazon, et al. often do not have enough data to train models accurately, especially when training deep neural networks, which require far more data than classical machine learning algorithms.

The creation of fake data, called synthetic data, is one way of overcoming this lack of data. This burgeoning technique can be used to generate datasets of all kinds, including images, audio files, and more. Synthetic data generation is often preceded by transfer learning, during which models are deployed to similar problems with substantial developmental overlap, reducing time and effort compared to starting from scratch.
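As a minimal sketch of the idea (not part of the original article), a synthetic tabular dataset can be generated with scikit-learn's `make_classification`; the sample counts, feature counts, and noise level below are illustrative assumptions, not values taken from this page.

```python
# Generate a synthetic classification dataset to stand in for scarce real data.
# All parameter values here are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(
    n_samples=10_000,   # number of synthetic rows
    n_features=20,      # total features per row
    n_informative=8,    # features that actually carry signal
    n_classes=2,        # binary labels
    flip_y=0.01,        # a little label noise to mimic real-world imperfection
    random_state=42,
)

# Split and use exactly as you would real data before training a model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, y_train.shape)  # (8000, 20) (8000,)
```

More realistic pipelines generate domain-specific synthetic data (for example, rendered images or simulated sensor readings), but the workflow is the same: produce labeled examples cheaply, then train or fine-tune a model as if the data were real.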
