Deep Learning for AI Interviews · Lesson 2 of 56

When to Use Deep Learning vs Classical ML

The Question to Ask First

Before jumping to deep learning, ask: "Is there a simpler model that could work here?"

Simpler models fail when:
  Input is raw, unstructured (images, text, audio)
  Relationships are highly non-linear with many interactions
  Scale is massive (millions of examples, billions of parameters needed)
  
Simpler models succeed when:
  Input is tabular with engineered features
  Dataset is small (under 10K examples)
  Interpretability is mandatory
  Compute budget is limited

When Deep Learning Is the Right Choice

1. Unstructured input data:
   Images → CNN, Vision Transformer
   Text → Transformer (BERT, GPT)
   Audio → CNN on spectrograms, Wav2Vec
   Video → 3D CNN, Video Transformer
   
   Simpler models cannot handle raw pixels or raw text tokens well.

2. Large datasets (tens of thousands to billions):
   DL accuracy tends to keep improving as data grows, while classical models usually plateau earlier.
   
3. Transfer learning is available:
   A pre-trained model (ResNet, BERT) can be fine-tuned with a moderate amount of labelled data.
   This effectively lends your task the data advantage of a model pre-trained on millions of images or billions of tokens.

4. Performance requirements that simpler models can't meet:
   State-of-the-art results currently rely on DL in: object detection, NLP tasks,
   protein folding, speech recognition, game playing.

5. Sequence modelling:
   Time series with complex temporal dependencies
   Language generation
   → RNNs, LSTMs, Transformers handle sequential structure natively.

When NOT to Use Deep Learning

1. Small dataset + tabular data:
   Less than 10K examples, structured features
   → Gradient boosting (XGBoost, LightGBM) likely wins

2. Interpretability required by law or policy:
   Medical device with FDA oversight, credit scoring (FCRA)
   → Logistic regression, decision tree, or rule-based system
   → Deep learning can be used with explainability tools (SHAP, LIME)
     but the model itself is not inherently interpretable

3. Low compute budget:
   No GPU, edge device with limited memory
   → Train smaller traditional models or use distillation

4. Very fast iteration needed:
   Research question where you need many experiments quickly
   → Start with simpler models to establish baselines, then go deep

5. The problem has a clean analytical solution:
   Linear relationships, rule-based systems where rules are known
   → Don't add unnecessary complexity
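
Item 2 above names logistic regression as an inherently interpretable option. A minimal sketch of why: each coefficient is a change in log-odds, so `exp(coefficient)` is the factor by which the odds are multiplied for a one-unit increase in that feature, holding the others fixed. The feature names and coefficient values below are hypothetical, not taken from any real model.

```python
import math

# Hypothetical coefficients from a fitted logistic regression (illustrative only).
coefficients = {
    "age_years": 0.04,
    "prior_defaults": 0.90,
    "income_thousands": -0.02,
}


def odds_ratios(coefs: dict[str, float]) -> dict[str, float]:
    """Convert log-odds coefficients to odds ratios via exp(beta)."""
    return {name: math.exp(beta) for name, beta in coefs.items()}


for name, ratio in odds_ratios(coefficients).items():
    print(f"+1 unit of {name} multiplies the odds by {ratio:.2f}")
```

A ratio above 1 means the feature pushes the prediction towards the positive class, below 1 away from it. A deep network offers no comparably direct reading of its weights, which is why post-hoc tools such as SHAP or LIME are needed there.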

The Deep Learning Tax

Switching from XGBoost to a neural network adds:

Training time:    minutes → hours (or days)
Tuning effort:    hyperparameter search is more complex
Data required:    often far more labelled examples (10× is a rough rule of thumb)
Compute:          usually needs a GPU (≥ $1K to buy, or cloud cost)
Debugging:        much harder; silent failures are common
Deployment:       larger model files, latency management
Interpretability: harder to explain to non-technical (e.g. clinical) stakeholders

Make sure the performance gain justifies these costs.
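
One way to make this check explicit is to agree on a minimum gain before running the experiment, then compare both models on the same validation metric. The sketch below is an assumption about how such a gate might look; the function name `gain_justifies_switch` and the default threshold of 0.02 are hypothetical and should be set per project.

```python
def gain_justifies_switch(
    baseline_score: float,
    dl_score: float,
    min_gain: float = 0.02,  # hypothetical threshold, agreed in advance
) -> bool:
    """Return True if the DL model beats the baseline by at least min_gain.

    Both scores must be the same metric (higher is better) measured on the
    same held-out validation set.
    """
    return (dl_score - baseline_score) >= min_gain


# Hypothetical validation scores for an XGBoost baseline and a neural network.
print(gain_justifies_switch(baseline_score=0.85, dl_score=0.90))
print(gain_justifies_switch(baseline_score=0.85, dl_score=0.855))
```

Fixing the threshold up front keeps a small, noisy improvement from being used to justify the extra training, tuning, and deployment cost listed above.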

Practical Decision Tree

Python
def should_use_deep_learning(
    data_type: str,           # "tabular", "image", "text", "audio", "video"
    n_labelled_samples: int,
    has_pretrained_model: bool,
    interpretability_mandatory: bool,
    has_gpu_budget: bool,
) -> tuple[bool, str]:
    
    # Unstructured perceptual data (image, audio, video): DL is almost always needed
    if data_type in ("image", "audio", "video"):
        if not has_gpu_budget:
            return False, f"Need GPU for {data_type} DL; consider cloud or reduce scope"
        return True, f"DL needed for {data_type} data"
    
    if data_type == "text":
        if has_pretrained_model:
            return True, "Use pre-trained LLM/BERT — fine-tuning is cheap"
        if n_labelled_samples < 1000:
            return False, "Too few samples for training from scratch — use TF-IDF + LR"
        return True, "DL for text (Transformer)"
    
    # Tabular data
    if data_type == "tabular":
        if interpretability_mandatory:
            return False, "Use logistic regression or decision tree for interpretability"
        # Matches the "under 10K examples" rule of thumb used earlier in this lesson
        if n_labelled_samples < 10_000:
            return False, "Use XGBoost; DL underperforms with few tabular samples"
        if has_gpu_budget and n_labelled_samples > 100_000:
            return True, "Consider tabular DL (TabNet, MLP) — test both vs XGBoost"
        return False, "XGBoost is likely sufficient for this tabular problem"
    
    return False, "Default: start simple, add complexity only if needed"


result, reason = should_use_deep_learning(
    data_type="text",
    n_labelled_samples=500,
    has_pretrained_model=True,
    interpretability_mandatory=False,
    has_gpu_budget=True,
)
print(f"Use DL: {result} — {reason}")

Interview Answer

"Deep learning is warranted when: the input is unstructured (images, text, audio — no simpler model handles raw pixels or tokens well); the dataset is large (tens of thousands or more, unless transfer learning is available); or state-of-the-art performance is required. For tabular data with moderate-sized datasets, gradient boosted trees (XGBoost) often outperform neural networks with less compute and better interpretability. The deep learning tax — more data, compute, tuning time, and harder debugging — must be justified by measurable performance gains. My default for a new problem: start with XGBoost as the baseline, then evaluate whether the performance gap justifies switching to deep learning."