Synthetic Data

"Data is the new oil," as The Economist famously put it. The aphorism is a bit cliché, but it is true that the tech giants have benefited disproportionately from AI, due in no small part to the amount of data they collect.

Companies that are not Google, Facebook, Amazon, et al. often do not have enough data to train models accurately, especially when training deep neural networks, which require far more data than classical machine learning algorithms.

The creation of fake data, called synthetic data, is one way of overcoming this lack of data. This burgeoning technique can be used to generate datasets of all kinds, including images, audio files, and more. Synthetic data generation is often preceded by transfer learning, during which models are deployed to similar problems with substantial developmental overlap, reducing time and effort compared to starting from scratch.
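As a minimal sketch of the idea (not part of the original article), a synthetic tabular dataset can be generated with scikit-learn's `make_classification`; the sample counts, feature counts, and noise level below are illustrative assumptions, not values taken from this page.

```python
# Generate a synthetic classification dataset to stand in for scarce real data.
# All parameter values here are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(
    n_samples=10_000,   # number of synthetic rows
    n_features=20,      # total features per row
    n_informative=8,    # features that actually carry signal
    n_classes=2,        # binary labels
    flip_y=0.01,        # a little label noise to mimic real-world imperfection
    random_state=42,
)

# Split and use exactly as you would real data before training a model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, y_train.shape)  # (8000, 20) (8000,)
```

More realistic pipelines generate domain-specific synthetic data (for example, rendered images or simulated sensor readings), but the workflow is the same: produce labeled examples cheaply, then train or fine-tune a model as if the data were real.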
