Linear vs Logistic Regression
Understand the key differences between linear and logistic regression: output type, loss function, activation, decision boundary, and when to use each, with code and interview-ready explanations.
The Key Difference
| Aspect | Linear Regression | Logistic Regression |
|---|---|---|
| Task | Regression (continuous output) | Binary classification (0 or 1) |
| Output | Any real number | Probability between 0 and 1 |
| Output function | Identity (no transformation) | Sigmoid: 1 / (1 + e^-z) |
| Loss function | Mean Squared Error (MSE) | Binary Cross-Entropy |
| Decision | Predict a number | Predict a class (threshold at 0.5) |
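The comparison in the table can be sketched in a few lines. This is only an illustration: the weights `w`, bias `b`, and input `x` are made-up values, not fitted parameters. It shows that both models start from the same linear score `z` and differ only in what they do with it.

```python
import numpy as np

# Hypothetical weights, bias, and input, chosen only for illustration
w = np.array([0.8, -0.5])
b = 0.1
x = np.array([2.0, 1.0])

# Shared linear combination: 0.8*2.0 + (-0.5)*1.0 + 0.1 = 1.2
z = w @ x + b

# Linear regression: identity output, any real number
linear_output = z

# Logistic regression: sigmoid output in (0, 1), then a 0.5 threshold
logistic_output = 1 / (1 + np.exp(-z))
predicted_class = int(logistic_output >= 0.5)

print(linear_output, logistic_output, predicted_class)
```

The same `z` is reported as-is by the linear model and turned into a probability (about 0.77 here) by the logistic model.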
Linear Regression
Predicts a continuous value as a weighted sum of inputs.
prediction = w₁x₁ + w₂x₂ + ... + wₙxₙ + b

import numpy as np
from sklearn.linear_model import LinearRegression
# Predict warfarin dose (mg/day) from patient features
X = np.array([
[65, 78, 1.1], # age, weight, creatinine
[72, 85, 1.4],
[58, 62, 0.9],
])
y = np.array([5.0, 4.5, 6.5]) # dose in mg
model = LinearRegression()
model.fit(X, y)
# Output: any real number
prediction = model.predict([[68, 75, 1.2]])
print(f"Predicted dose: {prediction[0]:.2f} mg") # e.g., 5.12 mg
# The output could theoretically be negative or > 100: there is no constraint
print(model.coef_) # Weight per feature
print(model.intercept_) # Bias term

Logistic Regression
Despite the name, logistic regression is a classification algorithm. It applies the sigmoid function to the same linear combination, which constrains the output to (0, 1), and interprets the result as the probability of class 1.
z = w₁x₁ + w₂x₂ + ... + wₙxₙ + b (same linear combination)
probability = sigmoid(z) = 1 / (1 + e^-z)
class = 1 if probability >= 0.5 else 0

import numpy as np
from sklearn.linear_model import LogisticRegression
# Predict: needs dose adjustment? (1 = yes, 0 = no)
X = np.array([
[65, 2.4, 1.1, 5], # age, INR, creatinine, current_dose
[72, 1.8, 1.4, 4],
[58, 3.8, 0.9, 6],
[80, 4.5, 1.8, 5],
])
y = np.array([0, 0, 1, 1])
model = LogisticRegression()
model.fit(X, y)
new_patient = np.array([[68, 3.5, 1.2, 5]])
prob = model.predict_proba(new_patient)[0][1] # P(class=1)
print(f"P(needs adjustment) = {prob:.2%}") # e.g., 82.34%; the exact value depends on the fit
print(f"Prediction: {model.predict(new_patient)[0]}") # e.g., 1

The Sigmoid Function
import numpy as np
import math
def sigmoid(z: float) -> float:
    return 1 / (1 + math.exp(-z))
# Properties:
print(sigmoid(0)) # 0.5: exactly at the boundary
print(sigmoid(10)) # ~1.0: very confident positive
print(sigmoid(-10)) # ~0.0: very confident negative
print(sigmoid(2)) # ~0.88: likely positive

The sigmoid "squashes" any real number into (0, 1), making it interpretable as a probability.
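One practical caveat: a sigmoid written with `math.exp(-z)` raises `OverflowError` for very negative inputs such as `z = -1000`. A minimal sketch of a vectorized, numerically stable variant follows. The name `stable_sigmoid` is a hypothetical helper introduced here for illustration, not a library function.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid for array-like input that avoids overflow for large |z|."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) <= 1, so the usual formula cannot overflow
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, use the equivalent form exp(z) / (1 + exp(z)), where exp(z) < 1
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

values = stable_sigmoid([-1000.0, -2.0, 0.0, 2.0, 1000.0])
print(values)
```

Both branches compute the same function; splitting on the sign of `z` just keeps every call to `np.exp` on a non-positive argument.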
Loss Functions Compared
Linear Regression: MSE Loss
def mse_loss(y_pred, y_true):
    return np.mean((y_pred - y_true) ** 2)

MSE suits regression because squaring makes positive and negative errors count equally and penalizes large deviations more than small ones.
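To see MSE at work, here is a small sketch on made-up data generated from y = 2x + 1. It fits a line with NumPy's least-squares solver, which minimizes the sum of squared errors (and therefore MSE), and compares the loss against a deliberately worse line. The data and the "worse" slope of 1.5 are assumptions for illustration.

```python
import numpy as np

def mse_loss(y_pred, y_true):
    return np.mean((y_pred - y_true) ** 2)

# Toy data generated from y = 2x + 1 (made up for illustration)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2 * x + 1

# Design matrix with a column of ones so the bias is fitted too
A = np.column_stack([x, np.ones_like(x)])
(w, b), *_ = np.linalg.lstsq(A, y, rcond=None)

best_loss = mse_loss(w * x + b, y)       # least-squares line
worse_loss = mse_loss(1.5 * x + 1, y)    # a line with the wrong slope

print(w, b, best_loss, worse_loss)
```

The wrong-slope line misses by 0, 0.5, 1, and 1.5, so its MSE is (0 + 0.25 + 1 + 2.25) / 4 = 0.875, while the fitted line's loss is essentially zero.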
Logistic Regression: Binary Cross-Entropy Loss
def binary_cross_entropy(y_pred: float, y_true: int) -> float:
    """
    Penalizes wrong confident predictions harshly.
    log(0) → -inf: if model is 100% wrong, loss is infinite.
    """
    eps = 1e-7  # clip predictions away from 0 and 1 to avoid log(0)
    y_pred = max(eps, min(1 - eps, y_pred))
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1 - y_pred))
# Loss when model is confident and right:
print(binary_cross_entropy(0.95, 1)) # ~0.05: low loss
# Loss when model is confident and wrong:
print(binary_cross_entropy(0.05, 1)) # ~3.0: very high loss

Decision Boundary
Both models are built on a hyperplane defined by the same linear combination. The difference is in what they do with it:
Linear Regression: the hyperplane is the prediction surface, giving a continuous value for each input
Logistic Regression: the hyperplane z = 0 is the decision boundary that separates the classes (z = 0 ⇔ p = 0.5)
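The equivalence between z = 0 and p = 0.5 can be checked directly. The sketch below uses hypothetical weights for a two-feature model and three hand-picked points: one on the boundary and one on each side.

```python
import numpy as np

# Hypothetical 2-feature model: z = w1*x1 + w2*x2 + b
w = np.array([1.0, -2.0])
b = 0.5

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

points = np.array([
    [1.5, 1.0],   # z = 1.5 - 2.0 + 0.5 =  0.0 -> on the boundary
    [3.0, 1.0],   # z = 3.0 - 2.0 + 0.5 =  1.5 -> positive side
    [0.0, 1.0],   # z = 0.0 - 2.0 + 0.5 = -1.5 -> negative side
])

z = points @ w + b
p = sigmoid(z)

# Thresholding the probability at 0.5 gives the same classes as checking the sign of z
assert np.array_equal(p >= 0.5, z >= 0)
print(z, p)
```

Because the sigmoid is monotonic, the 0.5 threshold on the probability never disagrees with the sign of the linear score, which is why the boundary stays a hyperplane.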
Can logistic regression handle non-linear boundaries?
No, not directly. But you can:
- Use polynomial features (add x², x₁x₂, etc.)
- Switch to kernel methods (e.g., SVM with RBF kernel)
- Use neural networks

Why Not Use Linear Regression for Classification?
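A minimal sketch of what goes wrong when a least-squares line is fitted to 0/1 labels. The data is a made-up one-feature example and the "far" inputs are chosen only to make the effect visible; the fit uses `np.linalg.lstsq` rather than scikit-learn to keep the example dependency-free.

```python
import numpy as np

# Toy 1-D data with binary labels (made up for illustration)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

# Ordinary least-squares fit of y = w*x + b
A = np.column_stack([x, np.ones_like(x)])
(w, b), *_ = np.linalg.lstsq(A, y, rcond=None)

# Predictions for inputs outside the training range
far_inputs = np.array([-5.0, 20.0])
preds = w * far_inputs + b

print(w, b)
print(preds)   # one value below 0 and one above 1
```

Neither prediction can be read as a probability, which is the gap the sigmoid closes.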
# Problem 1: output is unbounded
y_pred = 1.8 # Greater than 1: not a valid probability
# Problem 2: sensitive to outliers
# One far-away positive example can shift the decision boundary
# Problem 3: MSE is wrong for classification
# MSE penalizes even correct predictions if they're not exactly 0 or 1

When to Use Each
| Situation | Use |
|---|---|
| Predicting a number | Linear Regression |
| Predicting a binary outcome | Logistic Regression |
| Output must be a probability | Logistic Regression |
| Output is unbounded | Linear Regression |
| Need coefficient interpretability | Both work |
| Multi-class (not binary) | Multiclass Logistic Regression (softmax, or one-vs-rest) |
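For the multi-class row, softmax plays the role that the sigmoid plays in the binary case. Below is a minimal sketch, assuming one linear score per class; the three scores are hypothetical values rather than the output of a fitted model.

```python
import numpy as np

def softmax(z):
    """Turn a vector of class scores into probabilities that sum to 1."""
    z = np.asarray(z, dtype=float)
    shifted = z - np.max(z)        # subtracting the max avoids overflow in exp
    exp_z = np.exp(shifted)
    return exp_z / exp_z.sum()

# Hypothetical scores for three classes (one linear combination per class)
scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)

print(probs, probs.argmax())
```

With two classes and scores `(z, 0)`, softmax reduces to `sigmoid(z)` for the first class, so binary logistic regression is the two-class special case.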
Interview Answer Template
Q: What is the difference between linear and logistic regression?
Both use the same linear combination of features, but they differ in the output and loss function. Linear regression outputs any real number and minimizes mean squared error; it's used for regression tasks like predicting drug dose. Logistic regression applies the sigmoid function to that linear combination, constraining output to (0, 1), which is interpreted as a probability; it's used for binary classification. The loss function changes to binary cross-entropy, which penalizes confident wrong predictions harshly. Despite its name, logistic regression is a classification algorithm, not a regression algorithm. The key limitation of both is that they are linear in the features, so logistic regression's decision boundary is a hyperplane; for non-linear separation, you need polynomial features, SVMs with kernels, or neural networks.