Linear vs Logistic Regression

Understand the key differences between linear and logistic regression: output type, loss function, activation, decision boundary, and when to use each — with code and interview-ready explanations.

Asma Hafeez Khan · May 16, 2026 · 5 min read
Tags: Machine Learning, Linear Regression, Logistic Regression, Classification, Regression, Interview

The Key Difference

| Aspect | Linear Regression | Logistic Regression |
|---|---|---|
| Task | Regression (continuous output) | Binary classification (0 or 1) |
| Output | Any real number | Probability between 0 and 1 |
| Output function | Identity (no transformation) | Sigmoid: 1 / (1 + e^-z) |
| Loss function | Mean Squared Error (MSE) | Binary Cross-Entropy |
| Decision | Predict a number | Predict a class (threshold at 0.5) |


Linear Regression

Predicts a continuous value as a weighted sum of inputs.

prediction = w₁x₁ + w₂x₂ + ... + wₙxₙ + b
Python
import numpy as np
from sklearn.linear_model import LinearRegression

# Predict warfarin dose (mg/day) from patient features
X = np.array([
    [65, 78, 1.1],   # age, weight, creatinine
    [72, 85, 1.4],
    [58, 62, 0.9],
])
y = np.array([5.0, 4.5, 6.5])   # dose in mg

model = LinearRegression()
model.fit(X, y)

# Output: any real number
prediction = model.predict([[68, 75, 1.2]])
print(f"Predicted dose: {prediction[0]:.2f} mg")   # e.g., 5.12 mg

# The output could theoretically be negative or > 100 — no constraint
print(model.coef_)       # Weight per feature
print(model.intercept_)  # Bias term
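
For a fitted linear model, the prediction is just the weighted sum shown in the formula above. A minimal sketch in plain Python follows; the function name `predict_linear` and the weights are made up for illustration and are not fitted values:

```python
def predict_linear(weights, bias, features):
    """Weighted sum of features plus bias: w1*x1 + ... + wn*xn + b."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * x for w, x in zip(weights, features)) + bias

# Hypothetical weights, chosen only to make the arithmetic easy to follow
weights = [2.0, -1.0]
bias = 0.5
print(predict_linear(weights, bias, [3.0, 4.0]))   # 2*3 + (-1)*4 + 0.5 = 2.5
```

Nothing in this function bounds the result, which is why the output can be any real number.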

Logistic Regression

Despite the name, logistic regression is used for classification. It applies the sigmoid function to the linear combination, which constrains the output to the open interval (0, 1), and interprets the result as the probability of the positive class.

z = w₁x₁ + w₂x₂ + ... + wₙxₙ + b   (same linear combination)
probability = sigmoid(z) = 1 / (1 + e^-z)
class = 1 if probability >= 0.5 else 0
Python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Predict: needs dose adjustment? (1 = yes, 0 = no)
X = np.array([
    [65, 2.4, 1.1, 5],   # age, INR, creatinine, current_dose
    [72, 1.8, 1.4, 4],
    [58, 3.8, 0.9, 6],
    [80, 4.5, 1.8, 5],
])
y = np.array([0, 0, 1, 1])

model = LogisticRegression()
model.fit(X, y)

new_patient = np.array([[68, 3.5, 1.2, 5]])
prob = model.predict_proba(new_patient)[0][1]   # P(class=1)
print(f"P(needs adjustment) = {prob:.2%}")      # e.g., 82.34% (exact value depends on the fit)
print(f"Prediction: {model.predict(new_patient)[0]}")  # e.g., 1
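
The three steps listed before the scikit-learn example (linear combination, sigmoid, threshold) can also be written out by hand. This is a sketch in plain Python, not how scikit-learn is implemented; `predict_proba` and `predict_class` here are standalone helper names for this example, and the weights are hypothetical rather than fitted:

```python
import math

def predict_proba(weights, bias, features):
    """P(class = 1): sigmoid of the linear combination z."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 / (1 + math.exp(-z))

def predict_class(weights, bias, features, threshold=0.5):
    """Class 1 when the probability reaches the threshold, else class 0."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0

# Hypothetical weights (not fitted): z = 1*2 + (-1)*2 + 0 = 0
weights, bias = [1.0, -1.0], 0.0
print(predict_proba(weights, bias, [2.0, 2.0]))   # 0.5, since z = 0
print(predict_class(weights, bias, [2.0, 2.0]))   # 1, because 0.5 >= 0.5
print(predict_class(weights, bias, [0.0, 3.0]))   # 0, because z = -3
```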

The Sigmoid Function

Python
import numpy as np
import math

def sigmoid(z: float) -> float:
    return 1 / (1 + math.exp(-z))

# Properties:
print(sigmoid(0))     # 0.5   — exactly at the boundary
print(sigmoid(10))    # ~1.0  — very confident positive
print(sigmoid(-10))   # ~0.0  — very confident negative
print(sigmoid(2))     # ~0.88 — likely positive

The sigmoid "squashes" any real number into (0, 1), making it interpretable as a probability.
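
One practical caveat with a `math.exp`-based sigmoid: for large negative inputs, `math.exp(-z)` overflows and raises `OverflowError`. A common workaround, sketched here under the hypothetical name `stable_sigmoid`, is to switch to the algebraically equivalent form whose exponent is never positive:

```python
import math

def stable_sigmoid(z: float) -> float:
    """Sigmoid that avoids overflow by never exponentiating a positive number."""
    if z >= 0:
        return 1 / (1 + math.exp(-z))
    e = math.exp(z)          # z < 0, so e <= 1 and cannot overflow
    return e / (1 + e)

print(stable_sigmoid(0))       # 0.5
print(stable_sigmoid(-1000))   # 0.0 (exp underflows to zero instead of overflowing)
print(stable_sigmoid(1000))    # 1.0
```

In floating point the extreme cases round to exactly 0.0 and 1.0, even though the mathematical sigmoid never reaches either endpoint.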


Loss Functions Compared

Linear Regression — MSE Loss

Python
def mse_loss(y_pred, y_true):
    return np.mean((y_pred - y_true) ** 2)

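MSE suits regression because squaring makes positive and negative errors count equally and penalizes large errors more heavily than small ones. Minimizing it is also equivalent to maximum-likelihood estimation when the noise is assumed to be Gaussian.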
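
To see how squaring weights one large error more than several small ones, here is a small hand-rolled MSE in plain Python. The numbers are toy values invented for illustration:

```python
def mse(y_pred, y_true):
    """Mean of squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true)

y_true = [0.0, 0.0, 0.0, 0.0]
spread_out = [1.0, 1.0, 1.0, 1.0]   # four errors of size 1
one_big    = [0.0, 0.0, 0.0, 4.0]   # one error of size 4

# Both have the same total absolute error (4), but MSE differs:
print(mse(spread_out, y_true))   # 1.0
print(mse(one_big, y_true))      # 4.0
```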

Logistic Regression — Binary Cross-Entropy Loss

Python
def binary_cross_entropy(y_pred: float, y_true: int) -> float:
    """
    Penalizes wrong confident predictions harshly.
    log(0) → -inf: if model is 100% wrong, loss is infinite.
    """
    eps = 1e-7   # clip predictions away from 0 and 1 so log() stays finite
    y_pred = max(eps, min(1 - eps, y_pred))
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1 - y_pred))

# Loss when model is confident and right:
print(binary_cross_entropy(0.95, 1))   # ~0.05 — low loss

# Loss when model is confident and wrong:
print(binary_cross_entropy(0.05, 1))   # ~3.0 — very high loss
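
A side-by-side comparison shows what "harshly" means here. In this sketch (helper names `bce` and `squared_error` are introduced only for the example, and no clipping is applied), the true label is 1 and the prediction moves toward 0. The squared error can never exceed 1, while the cross-entropy keeps growing:

```python
import math

def bce(p: float, y: int) -> float:
    """Binary cross-entropy for one prediction; p is assumed to be in (0, 1)."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def squared_error(p: float, y: int) -> float:
    """Squared difference between the predicted probability and the label."""
    return (p - y) ** 2

# True label is 1; the model grows more confident in the wrong answer
for p in (0.5, 0.05, 0.001):
    print(f"p={p}: squared error={squared_error(p, 1):.3f}, BCE={bce(p, 1):.3f}")
```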

Decision Boundary

Both models are linear in the features, but only logistic regression has a decision boundary. The difference is in how each uses the hyperplane:

Linear Regression:   the hyperplane is the prediction itself (a continuous value)
Logistic Regression: the hyperplane z = 0 separates the classes (z = 0 → p = 0.5)
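
The correspondence between z = 0 and p = 0.5 can be checked numerically. In the sketch below the weights are hypothetical, and `boundary_x2` is a helper name introduced only for this example. For two features, it solves w₁x₁ + w₂x₂ + b = 0 for x₂:

```python
import math

w1, w2, b = 2.0, -1.0, 0.5   # hypothetical weights, not fitted to any data

def probability(x1: float, x2: float) -> float:
    """Sigmoid of the linear combination for a two-feature input."""
    z = w1 * x1 + w2 * x2 + b
    return 1 / (1 + math.exp(-z))

def boundary_x2(x1: float) -> float:
    """x2 value on the decision boundary for a given x1 (requires w2 != 0)."""
    return -(w1 * x1 + b) / w2

x1 = 1.0
x2 = boundary_x2(x1)                  # 2.5
print(probability(x1, x2))            # 0.5 on the boundary
print(probability(x1, x2 - 1) > 0.5)  # True: one side of the line
print(probability(x1, x2 + 1) < 0.5)  # True: the other side
```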

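Can logistic regression handle non-linear boundaries?
  Not directly, because its boundary is linear in the features it is given. Options:
  - Add polynomial features (x², x₁x₂, etc.) so the boundary is non-linear in the original inputs
  - Switch to a kernel method (e.g., an SVM with an RBF kernel)
  - Switch to a neural network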
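
The polynomial-feature option can be illustrated with a circular boundary, which no straight line in (x₁, x₂) can reproduce. In this sketch the weights are hypothetical (hand-picked, not learned) and `squared_features` is a helper name made up for the example. The model is still linear, just in x₁² and x₂²:

```python
import math

def squared_features(x1: float, x2: float):
    """Map the raw inputs to their squares (a small polynomial feature set)."""
    return [x1 ** 2, x2 ** 2]

# Hypothetical weights: z = x1^2 + x2^2 - 1, so z = 0 is the unit circle
weights, bias = [1.0, 1.0], -1.0

def classify(x1: float, x2: float) -> int:
    """Logistic-regression-style prediction on the squared features."""
    z = sum(w * f for w, f in zip(weights, squared_features(x1, x2))) + bias
    p = 1 / (1 + math.exp(-z))
    return 1 if p >= 0.5 else 0

print(classify(0.0, 0.0))   # 0: inside the circle
print(classify(2.0, 0.0))   # 1: outside the circle
print(classify(0.0, -2.0))  # 1: outside, on the opposite side
```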

Why Not Use Linear Regression for Classification?

Python
# Problem 1: output is unbounded
y_pred = 1.8   # Greater than 1 — not a valid probability

# Problem 2: sensitive to outliers
# One far-away positive example can shift the decision boundary

# Problem 3: MSE is wrong for classification
# MSE penalizes even correct predictions if they're not exactly 0 or 1
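
Problems 1 and 2 can be reproduced with a one-feature least-squares fit on 0/1 labels. This is a sketch on toy data invented for illustration; `fit_line` and `boundary` are helper names introduced here, and the 0.5 cutoff is the assumed way of turning the line into a classifier:

```python
def fit_line(xs, ys):
    """Closed-form least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def boundary(slope, intercept):
    """x at which the fitted line crosses 0.5."""
    return (0.5 - intercept) / slope

# Toy 0/1 labels: classes switch between x = 2 and x = 3
slope_a, icpt_a = fit_line([1, 2, 3, 4], [0, 0, 1, 1])
# Same data plus one far-away positive example at x = 20
slope_b, icpt_b = fit_line([1, 2, 3, 4, 20], [0, 0, 1, 1, 1])

print(boundary(slope_a, icpt_a))      # about 2.5
print(boundary(slope_b, icpt_b))      # about 3.2: the boundary moved right
print(slope_b * 3 + icpt_b < 0.5)     # True: x = 3 is now misclassified as 0
print(slope_b * 20 + icpt_b > 1.0)    # True: the "probability" exceeds 1
```

The added point was already on the correct side, yet it still drags the line and flips the prediction for x = 3.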

When to Use Each

| Situation | Use |
|---|---|
| Predicting a number | Linear Regression |
| Predicting a binary outcome | Logistic Regression |
| Output must be a probability | Logistic Regression |
| Output is unbounded | Linear Regression |
| Need coefficient interpretability | Both work |
| Multi-class (not binary) | Multinomial (softmax) logistic regression, or one-vs-rest with binary logistic regression |
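
For the multi-class row, softmax plays the role that sigmoid plays in the binary case. A minimal sketch in plain Python, with hypothetical class scores; the last check shows that with two classes and scores [z, 0], softmax gives the same value as the sigmoid of z:

```python
import math

def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    m = max(scores)                             # subtract the max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])                # hypothetical class scores
print(probs.index(max(probs)))                  # 0: the highest score wins
print(abs(sum(probs) - 1.0) < 1e-9)             # True

# With two classes and scores [z, 0], softmax reduces to the sigmoid of z
z = 2.0
print(abs(softmax([z, 0.0])[0] - 1 / (1 + math.exp(-z))) < 1e-12)   # True
```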


Interview Answer Template

Q: What is the difference between linear and logistic regression?

Both use the same linear combination of features, but they differ in output and loss function. Linear regression outputs any real number and minimizes mean squared error; it is used for regression tasks such as predicting a drug dose. Logistic regression passes that linear combination through the sigmoid function, which constrains the output to (0, 1) so it can be interpreted as a probability; it is used for binary classification. Its loss function is binary cross-entropy, which penalizes confident wrong predictions harshly. Despite its name, logistic regression is used as a classification algorithm. The key limitation of both is linearity in the features: logistic regression's decision boundary is a hyperplane, so non-linear separation requires polynomial features, SVMs with kernels, or neural networks.
