Comparison of ML Frameworks
Last updated
There are several popular frameworks for DNNs and classical ML. All widely used frameworks are open source. Most but not all support GPU acceleration.
TensorFlow (by Google): Offers training, distributed training, and inference (TensorFlow Serving), as well as other capabilities such as TFLite (mobile, embedded), Federated Learning (compute on the end-user device, share learnings centrally), TensorFlow.js (web-native ML), and TFX (production ML platform). TensorFlow is widely adopted, especially in enterprise/production-grade ML.
Keras (also by Google): A higher-level API that wraps TensorFlow and other frameworks such as Theano and CNTK, which form the “backend” to Keras in this context.
PyTorch (by Facebook): An easy-to-use framework known for rapid prototyping. Facebook merged Caffe2 into the PyTorch project in 2018 to support productionizing and serving PyTorch-based models. PyTorch is especially popular in the research community.
Fast.ai (by Fast.ai team): A library that sits on top of PyTorch to simplify and accelerate deep learning training. Fast.ai is very new and its full reach is not yet known.
Microsoft Cognitive Toolkit (by Microsoft; formerly CNTK): A framework focused on large-scale production deployments. The community is small relative to other frameworks.
MXNet (by Apache but associated with Amazon): An open-source deep learning framework focused on large-scale production deployments. MXNet is popular at Microsoft, Intel, and Amazon but not in the research community.
Gluon (by Amazon and Microsoft): An attempt to create a Keras-like API layer for MXNet and CNTK. Gluon is not very popular.
Chainer (by a Japanese company called Preferred Networks): A deep learning framework that is popular in Japan and supported by tech giants such as IBM, NVIDIA, AWS, and Intel. That said, the Chainer community is relatively small.
PaddlePaddle (by Baidu): A scalable deep learning platform, originally developed for Baidu products, that is focused on large-scale production deployments. The PaddlePaddle community is relatively small.
Deeplearning4j (by Konduit, related to Eclipse): A deep learning programming library built for companies that need support for Java and Scala. The community is relatively small.
Caffe & Caffe2 (Caffe by UC Berkeley, Caffe2 by Facebook): Caffe is a deep learning framework that is especially suited to image classification and image segmentation. Caffe is no longer popular, but Facebook created a successor called Caffe2, which was later merged into PyTorch.
These frameworks are mostly used for “Classical ML” rather than Deep Learning:
XGBoost: An open-source library that implements gradient boosting, one of the most commonly used machine learning algorithms. The community is very large.
Scikit-learn: A machine learning library that provides algorithms for many standard machine learning tasks such as clustering, regression, classification, dimensionality reduction, and more. The community is very large.
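To make the idea behind gradient boosting more concrete, here is a toy sketch in plain Python. It is not how XGBoost or Scikit-learn implement the algorithm: it assumes a single numeric feature, squared-error loss, and one-split "stumps" as weak learners, and all function names (fit_stump, fit_boosted, predict_boosted) are hypothetical. Each round fits a stump to the current residuals and adds a scaled-down copy of it to the ensemble.

```python
# Toy gradient boosting for regression (squared-error loss, one feature).
# Illustrative only: real libraries such as XGBoost work very differently
# under the hood. All names here are hypothetical.

def fit_stump(xs, residuals):
    """Fit a one-split regression stump that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # Only one distinct x value: fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return xs[0], mean, mean
    return best[1], best[2], best[3]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosted(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit stumps to the residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for x, p in zip(xs, predictions)]
    return base, stumps, learning_rate


def predict_boosted(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


# Made-up data with a jump between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
model = fit_boosted(xs, ys)
print([round(predict_boosted(model, x), 2) for x in xs])
```

In practice, a library such as XGBoost or Scikit-learn would be used instead of hand-written code; the sketch only shows why many weak learners, each correcting the errors of the previous ones, can add up to a strong model.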
Other popular Python libraries are typically used to prepare and explore data before it is passed to ML frameworks. These include:
NumPy (arrays & linear algebra library)
SciPy (scientific computing library)
Pandas (data extraction & preparation)
Matplotlib (plotting & data visualization)
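As a rough illustration of the kind of data preparation these libraries are used for, the sketch below uses NumPy to fill a missing value, standardize the features, and split the rows into training and test sets. The data is randomly generated, and the 80/20 split and mean imputation are assumptions chosen for the example, not recommendations.

```python
import numpy as np

# Synthetic feature matrix: 100 rows, 3 columns (made-up data).
rng = np.random.default_rng(seed=0)
features = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
features[0, 1] = np.nan  # simulate one missing value

# Fill missing entries with the column mean, ignoring NaNs.
col_means = np.nanmean(features, axis=0)
filled = np.where(np.isnan(features), col_means, features)

# Standardize each column to zero mean and unit variance.
standardized = (filled - filled.mean(axis=0)) / filled.std(axis=0)

# Shuffle the rows and split them 80/20 into train and test sets.
indices = rng.permutation(len(standardized))
split = int(0.8 * len(indices))
train = standardized[indices[:split]]
test = standardized[indices[split:]]

print(train.shape, test.shape)
```

The resulting arrays can then be handed to most of the frameworks listed above, which generally accept NumPy arrays or convert from them.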