# Logistic Regression

![Source: Technology of Computing](https://2327526407-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LvBP1svpACTB1R1x_U4%2F-Lw70vAIGPfRR1AjprLi%2F-LwAVc1EdfmPMge5dlYC%2Fimage.png?alt=media\&token=d72e3231-0d64-4bb7-9e4c-20577940763d)

Logistic regression is a machine learning algorithm used for classification problems. The term *logistic* comes from the logistic function, a type of **sigmoid function** known for its characteristic S-shaped curve, which the model uses to map inputs to probabilities. A logistic regression model predicts probability values, which are then mapped to two (binary classification) or more (multiclass classification) classes.

![Source: Analytics India Magazine](https://2327526407-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LvBP1svpACTB1R1x_U4%2F-Lw70vAIGPfRR1AjprLi%2F-LwAYhg8-OqxqG7aWKhN%2Fimage.png?alt=media\&token=e2d056d0-fecc-40a2-8460-bd2ea82f9580)

![Formula of a sigmoid function](https://2327526407-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LvBP1svpACTB1R1x_U4%2F-Lw70vAIGPfRR1AjprLi%2F-LwAgUAGfE21Zw1J-gnb%2Fimage.png?alt=media\&token=87ef1b15-f2fe-44f7-816d-a794a1bfc578)

Where:

* 1 = the curve's maximum value
* *S(z)* = the output, between 0 and 1 (a probability estimate)
* *z* = the input
* *e* = the base of the natural logarithm (Euler's number)
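The sigmoid function above can be written directly from its definition. A minimal sketch (the function name `sigmoid` is illustrative):

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real-valued z to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0))   # 0.5 — the midpoint of the S-curve
print(sigmoid(4))   # close to 1
print(sigmoid(-4))  # close to 0
```

Large positive inputs saturate toward 1, large negative inputs toward 0, which is what lets the output be read as a probability.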

In multiclass classification with logistic regression, a **softmax function** is used instead of the sigmoid function.
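Softmax generalizes the sigmoid to several classes: it turns a vector of raw scores into probabilities that sum to 1. A small sketch (the max-subtraction trick is a standard numerical-stability measure, not specific to this article):

```python
import math

def softmax(scores):
    """Convert a list of raw class scores into probabilities summing to 1."""
    m = max(scores)                           # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]  # shifting scores leaves the result unchanged
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])  # highest score gets the highest probability
```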

Like [linear regression](https://machine-learning.paperspace.com/wiki/linear-regression), [gradient descent](https://machine-learning.paperspace.com/wiki/gradient-descent) is typically used to optimize the values of the coefficients (one weight per input feature) by iteratively minimizing the loss of the model during training.
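The training loop can be sketched end to end with batch gradient descent. This is a minimal illustration, not a production implementation: the toy dataset, learning rate, and function names are all assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, lr=0.1, epochs=1000):
    """Fit logistic regression weights with batch gradient descent."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of log loss w.r.t. the logit
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Step each coefficient against the averaged gradient.
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

# Toy 1-D dataset: class 1 when x is above roughly 2.
X = [[0.0], [1.0], [3.0], [4.0]]
y = [0, 0, 1, 1]
w, b = train(X, y)
```

After training, the learned weight and bias place the model's probability below 0.5 for small inputs and above 0.5 for large ones.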

The **decision boundary** is the threshold at which a predicted probability is mapped to a discrete class, e.g. pass/fail or vegan/vegetarian/omnivore.
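For binary classification the thresholding step is one comparison. A sketch, assuming the conventional 0.5 cutoff (the threshold is a design choice and can be tuned):

```python
def classify(probability, threshold=0.5):
    """Map a predicted probability to a discrete class label (0 or 1)."""
    return 1 if probability >= threshold else 0

classify(0.73)  # above the threshold -> class 1
classify(0.21)  # below the threshold -> class 0
```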

The cost function in logistic regression is more complex than in linear regression. For example, mean squared error would yield a non-convex function with many local minima, making it difficult to optimize with gradient descent. **Cross entropy**, also called **log loss**, is the cost function used with logistic regression.
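Binary cross entropy penalizes confident wrong predictions heavily and confident correct ones lightly. A minimal sketch (the clipping epsilon is a common practical guard against `log(0)`, not part of the formula itself):

```python
import math

def log_loss(y_true, y_pred, eps=1e-15):
    """Binary cross-entropy (log loss), averaged over examples."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

log_loss([1, 0], [0.9, 0.1])  # confident, correct -> small loss
log_loss([1, 0], [0.1, 0.9])  # confident, wrong -> large loss
```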

**Regularization** is a technique used to prevent [overfitting](https://machine-learning.paperspace.com/wiki/overfitting-vs-underfitting) by penalizing large coefficient values, so that no single feature can dominate the model. Regularization is extremely important in logistic regression.
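One common form is an L2 penalty added to the cost function: the sum of squared weights scaled by a strength hyperparameter. A sketch (the name `lam` for the regularization strength is an assumption for the example):

```python
def l2_penalty(weights, lam):
    """L2 regularization term added to the cost: lam * sum of squared weights."""
    return lam * sum(w * w for w in weights)

# Larger weights incur a larger penalty, nudging the optimizer toward smaller ones.
l2_penalty([3.0, 4.0], lam=0.1)  # 0.1 * (9 + 16) = 2.5
```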

[Accuracy](https://machine-learning.paperspace.com/accuracy-and-loss#accuracy), a [model evaluation metric](https://machine-learning.paperspace.com/wiki/metrics-in-machine-learning), is used to measure how accurate a model's predictions are -- it is expressed as the number of correct classifications divided by the total number of predictions.
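That definition translates directly into code. A minimal sketch:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

accuracy([1, 0, 1, 1], [1, 0, 0, 1])  # 3 of 4 correct -> 0.75
```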

### Linear vs Logistic Regression

**Linear regression** predictions are continuous (e.g. test scores from 0-100).

**Logistic regression** predictions classify items where only specific values or classes are allowed (e.g. binary classification or multiclass classification). The model provides a probability score (confidence) with each prediction.
