Data Science vs Machine Learning vs Deep Learning
Data Science is used to find insight in data
Machine Learning models make predictions
Deep Learning can take actions autonomously (e.g. drive a car)
Data science is the process of manually extracting insight from data using the basic principles and techniques of statistical analysis. This includes basic tasks like computing p-values and confidence intervals. Though fairly complex findings may be derived from this process, these methods follow a specific set of hand-crafted instructions, so the complexity and accuracy of the results are fundamentally limited.
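As a rough sketch of what that kind of hand-crafted analysis looks like in code -- the groups, effect sizes, and sample sizes below are invented purely for illustration:

```python
# A minimal sketch of classical statistical analysis: a two-sample t-test
# (p-value) and a 95% confidence interval, on made-up data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)
group_a = rng.normal(loc=100.0, scale=15.0, size=50)  # e.g. a control group
group_b = rng.normal(loc=108.0, scale=15.0, size=50)  # e.g. a treatment group

# p-value: how likely is a difference this large if both groups
# actually share the same mean?
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# 95% confidence interval for the mean of group_b
mean_b = np.mean(group_b)
sem_b = stats.sem(group_b)
ci_low, ci_high = stats.t.interval(0.95, df=len(group_b) - 1, loc=mean_b, scale=sem_b)
print(f"mean = {mean_b:.1f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```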
In classical terms, a data scientist’s role is focused on analytics, forecasting, visualization, and reporting with the goal of informing business decisions. Since data science evolved as an extension of big data analytics, traditional tools included SAS, SQL, R, Scala, etc.
Today, however, the role of the data scientist has evolved to encompass both ML and DL (which are quite different in terms of the expertise required and daily activity). Companies still require basic analysis, dashboards, reporting, and queries -- all of which are still developed by data scientists. The end result is that the role can and does mean different things at different companies (or even different things at the same company).
Machine learning algorithms parse data, learn from it without explicit step-by-step human guidance, and then apply that learning to make informed decisions. Classical machine learning performs well on small-to-medium datasets and typically does not require high-end GPU processing, since training and execution times are short.
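A minimal sketch of that workflow, assuming scikit-learn and its small bundled iris dataset (chosen here only for illustration):

```python
# Parse data, fit a model, make predictions -- the classic ML loop.
# Runs on a laptop CPU in well under a second; no GPU involved.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1000)  # a simple, fast, CPU-friendly model
model.fit(X_train, y_train)                # "learn from the data"
print("test accuracy:", model.score(X_test, y_test))  # "apply that learning"
```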
Machine learning requires a careful understanding of the input features. Feature engineering as a whole is a taxing, menial, and error-prone process with a lot of human involvement. One benefit of certain ML algorithms is that their output is interpretable (e.g. logistic regression, decision trees), but this does not apply to all algorithms (e.g. kernel SVMs and gradient-boosted ensembles like XGBoost are very difficult to interpret).
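To make the interpretability point concrete, here is a small sketch (again assuming scikit-learn's iris data): a fitted logistic regression exposes one coefficient per input feature, so you can read off how each feature pushes the prediction -- something you cannot do with the internals of a kernel SVM or a large boosted ensemble.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

data = load_iris()
model = LogisticRegression(max_iter=1000).fit(data.data, data.target)

# Multiclass logistic regression learns one coefficient vector per class;
# coef_[0] holds the coefficients for the first class. Each (feature,
# coefficient) pair is directly inspectable.
for name, coef in zip(data.feature_names, model.coef_[0]):
    print(f"{name}: {coef:+.2f}")
```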
Popular ML algorithms include: Linear Regression, Logistic Regression, SVMs, Nearest Neighbors, Decision Trees, Random Forests, PCA, Naive Bayes Classifiers, K-Means Clustering, and Gradient-Boosted Trees.
Deep learning is an artificial neural network-based approach loosely modeled on the human brain -- namely the biological neural networks found in the neocortex where thinking occurs.
Deep learning is technically defined as machine learning with a neural network that has more than one hidden layer. Artificial neural networks (ANNs) require at least three layers: input (features), hidden, and output (prediction). DL algorithms can find much more complex and nuanced patterns than classical ML algorithms and can operate on almost any type of data.
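As a minimal sketch of that definition, here is a network with two hidden layers written in PyTorch; the layer sizes are arbitrary and chosen only for illustration:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 32),  # input layer -> first hidden layer (10 features in)
    nn.ReLU(),
    nn.Linear(32, 32),  # second hidden layer: this is what makes it "deep"
    nn.ReLU(),
    nn.Linear(32, 1),   # output layer (a single prediction)
)

x = torch.randn(4, 10)  # a batch of 4 examples with 10 features each
print(model(x).shape)   # torch.Size([4, 1])
```

Stacking additional Linear/ReLU pairs is all it takes to make the network deeper.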
The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones. If we draw a graph showing how these concepts are built on top of each other, the graph is deep and has many layers. For this reason, we call this approach to AI deep learning.
Deep learning models can find much higher-level, more abstract representations of data than machine learning models. For example, to determine whether the subject of an image is a dog or a cat, a DL algorithm first extracts low-level features such as the edges contained in raw pixels, then composes those edges into shapes, then infers slightly more abstract concepts such as a nose and eyes from those shapes, and from there proceeds to recognize the highly abstract concept of a face.
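The sketch below expresses that hierarchy as a tiny convolutional network in PyTorch. Mapping individual layers to "edges, shapes, parts, face" is a simplification of what a trained network actually learns, but early convolutional layers do tend to capture edge-like features while deeper layers respond to more abstract structure:

```python
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # low level: edges, colors
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # mid level: shapes, textures
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1),  # higher level: parts (eyes, nose)
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(64, 2),                             # most abstract: dog vs. cat
)

image_batch = torch.randn(1, 3, 64, 64)  # one fake 64x64 RGB image
print(cnn(image_batch).shape)            # torch.Size([1, 2])
```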
An often-cited benefit of deep learning models is their ability to perform automatic feature extraction from raw data. A few downsides are the intensive training time, the need for lots of data, and the lack of interpretability.
Ultimately, ML and DL models operate on the same principle: if you feed an algorithm enough data, the machine can analyze it and predict patterns. There is one key difference that matters at scale: due to their comparatively simple architecture, the accuracy of ML algorithms tends to plateau at a certain point regardless of whether more training data is available. Conversely, the more data you feed a DL model, the more accurately it will recognize patterns.
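One way to probe the plateau claim empirically is a learning curve: model accuracy as a function of training-set size. A minimal sketch, assuming scikit-learn and its small bundled digits dataset (where exactly the curve flattens depends heavily on the model and the problem):

```python
# Learning curve for a simple model: on many problems the accuracy
# flattens well before all of the available data is used.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = load_digits(return_X_y=True)
sizes, _, test_scores = learning_curve(
    LogisticRegression(max_iter=2000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5,
)
for n, score in zip(sizes, test_scores.mean(axis=1)):
    print(f"{n:4d} training examples -> accuracy {score:.3f}")
```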
As a result, a DL model has the potential to achieve a higher degree of accuracy. Larger networks (which take longer to train) can further improve accuracy. There are diminishing returns at a certain point, but this is a key selling point -- especially for companies with lots and lots of data, which in today's world is more the standard than the exception. If you’re Walmart, a 1% improvement in the accuracy of a recommender system could lead to many millions of dollars of additional revenue.
Without data, you can't have machine learning. The machine learning process relies on huge amounts of data for training. The work of collecting and preparing that data has given rise to a field called Data Engineering, which is becoming a key, independent function within large organizations.