When to Use Deep Learning

The Question to Ask First

Before jumping to deep learning, ask: "Is there a simpler model that could work here?"

Simpler models fail when:
  Input is raw, unstructured (images, text, audio)
  Relationships are highly non-linear with many interactions
  Scale is massive (millions of examples, or a task that needs very high model capacity)
  
Simpler models succeed when:
  Input is tabular with engineered features
  Dataset is small (under 10K examples)
  Interpretability is mandatory
  Compute budget is limited
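
One way to make this question concrete is to measure the simplest possible baseline before training anything. The sketch below is illustrative only: the function name and the toy labels are assumptions, not part of any library. It computes the accuracy of always predicting the most common class, which is the bar any model, simple or deep, has to clear.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most common training label."""
    if not train_labels or not test_labels:
        raise ValueError("both label lists must be non-empty")
    # Most frequent label in the training set
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    hits = sum(1 for y in test_labels if y == majority_label)
    return hits / len(test_labels)


# Toy, made-up labels with a 90/10 class imbalance
train = ["ok"] * 90 + ["fraud"] * 10
test = ["ok"] * 18 + ["fraud"] * 2
baseline = majority_baseline_accuracy(train, test)
print(f"Majority-class baseline accuracy: {baseline:.2f}")
```

If a simple model barely beats this number, a deeper model is unlikely to be the first thing to fix; the features or labels probably are.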

When Deep Learning Is the Right Choice

1. Unstructured input data:
   Images → CNN, Vision Transformer
   Text → Transformer (BERT, GPT)
   Audio → CNN on spectrograms, Wav2Vec
   Video → 3D CNN, Video Transformer
   
   Simpler models cannot handle raw pixels or raw text tokens well.

2. Large datasets (tens of thousands to billions of examples):
   DL performance tends to keep improving as data grows, whereas traditional ML models usually plateau earlier.
   
3. Transfer learning is available:
   A pre-trained model (ResNet, BERT) can be fine-tuned with a moderate amount of labelled data.
   This effectively lends your task the data advantage of a model pre-trained on millions of images or billions of tokens.

4. Performance requirements that simpler models can't meet:
   State-of-the-art results currently depend on DL in: object detection, most NLP tasks,
   protein structure prediction, speech recognition, and game playing.

5. Sequence modelling:
   Time series with complex temporal dependencies
   Language generation
   → RNNs, LSTMs, Transformers handle sequential structure natively.
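
The pairings in the list above can be kept as a small lookup table. This is a minimal sketch: the dictionary name, the function name, and the "sequence" key are illustrative assumptions, and the entries are starting points rather than a ranking.

```python
# Data type -> architecture families, copied from the list above.
ARCHITECTURES_BY_DATA_TYPE = {
    "image": ["CNN", "Vision Transformer"],
    "text": ["Transformer (BERT, GPT)"],
    "audio": ["CNN on spectrograms", "Wav2Vec"],
    "video": ["3D CNN", "Video Transformer"],
    "sequence": ["RNN", "LSTM", "Transformer"],
}


def candidate_architectures(data_type: str) -> list[str]:
    """Return starting-point architectures, or an empty list for other data types."""
    return ARCHITECTURES_BY_DATA_TYPE.get(data_type.lower(), [])


for kind in ("image", "audio", "tabular"):
    options = candidate_architectures(kind)
    label = ", ".join(options) if options else "no DL default; try a simpler model first"
    print(f"{kind} -> {label}")
```

Tabular data deliberately has no entry here, so the lookup falls through to the simpler-model suggestion.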

When NOT to Use Deep Learning

1. Small dataset + tabular data:
   Fewer than 10K examples, structured features
   → Gradient boosting (XGBoost, LightGBM) likely wins

2. Interpretability required by law or policy:
   Medical device with FDA oversight, credit scoring (FCRA)
   → Logistic regression, decision tree, or rule-based system
   → Deep learning can be used with explainability tools (SHAP, LIME)
     but the model itself is not inherently interpretable

3. Low compute budget:
   No GPU, edge device with limited memory
   → Train smaller traditional models or use distillation

4. Very fast iteration needed:
   Research question where you need many experiments quickly
   → Start with simpler models to establish baselines, then go deep

5. The problem has a clean analytical solution:
   Linear relationships, rule-based systems where rules are known
   → Don't add unnecessary complexity

The Deep Learning Tax

Switching from XGBoost to a neural network adds:

Training time:     minutes → hours (or days)
Tuning effort:     hyperparameter search is more complex
Data required:     often many more labelled examples (roughly 10× as a rule of thumb)
Compute:           usually needs a GPU (≥ $1K to buy, or ongoing cloud cost)
Debugging:         much harder; silent failures are common
Deployment:        larger model files, latency management
Interpretability:  harder to explain to stakeholders such as clinicians or regulators

Make sure the performance gain justifies these costs.
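
A rough way to turn that check into something repeatable is sketched below. Everything here is an assumption for illustration: the function name, the two thresholds (`min_gain`, `max_slowdown`), and the example scores are made up, and real projects would pick their own metric and limits.

```python
def dl_tax_is_justified(
    baseline_score: float,
    dl_score: float,
    baseline_train_minutes: float,
    dl_train_minutes: float,
    min_gain: float = 0.02,      # assumed threshold; set per project
    max_slowdown: float = 50.0,  # assumed threshold; set per project
) -> tuple[bool, str]:
    """Compare a DL model against a simpler baseline on gain and training cost."""
    if baseline_train_minutes <= 0:
        raise ValueError("baseline_train_minutes must be positive")
    gain = dl_score - baseline_score
    slowdown = dl_train_minutes / baseline_train_minutes
    if gain < min_gain:
        return False, f"gain {gain:+.3f} is below the required {min_gain:.3f}"
    if slowdown > max_slowdown:
        return False, f"training is {slowdown:.0f}x slower, above the {max_slowdown:.0f}x limit"
    return True, f"gain {gain:+.3f} at {slowdown:.0f}x training time"


# Hypothetical numbers: a small gain for a 36x longer training run
ok, why = dl_tax_is_justified(0.912, 0.918, baseline_train_minutes=5, dl_train_minutes=180)
print(f"Switch to DL: {ok} ({why})")
```

Training time is only one line of the tax; the same pattern extends to inference latency, labelling cost, or model size if those matter more for the project.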

Practical Decision Tree

Python

def should_use_deep_learning(
    data_type: str,           # "tabular", "image", "text", "audio", "video"
    n_labelled_samples: int,
    has_pretrained_model: bool,
    interpretability_mandatory: bool,
    has_gpu_budget: bool,
) -> tuple[bool, str]:
    """Return (use_dl, reason). Thresholds are rules of thumb, not hard limits."""

    # Unstructured data: DL almost always needed
    if data_type in ("image", "audio", "video"):
        if not has_gpu_budget:
            return False, f"Need GPU for {data_type} DL: consider cloud or reduce scope"
        return True, f"DL needed for {data_type} data"

    if data_type == "text":
        if has_pretrained_model:
            return True, "Use pre-trained LLM/BERT: fine-tuning is cheap"
        if n_labelled_samples < 1000:
            return False, "Too few samples for training from scratch: use TF-IDF + LR"
        return True, "DL for text (Transformer)"

    # Tabular data: the only branch where interpretability is checked,
    # because inherently interpretable alternatives exist here
    if data_type == "tabular":
        if interpretability_mandatory:
            return False, "Use logistic regression or decision tree for interpretability"
        if n_labelled_samples < 10_000:  # matches the "under 10K" rule of thumb above
            return False, "Use XGBoost: DL underperforms with few tabular samples"
        if has_gpu_budget and n_labelled_samples > 100_000:
            return True, "Consider tabular DL (TabNet, MLP): test both vs XGBoost"
        return False, "XGBoost is likely sufficient for this tabular problem"

    # Unrecognised data type
    return False, "Default: start simple, add complexity only if needed"


# Example: only 500 labelled text samples, but a pre-trained model is available
result, reason = should_use_deep_learning(
    data_type="text",
    n_labelled_samples=500,
    has_pretrained_model=True,
    interpretability_mandatory=False,
    has_gpu_budget=True,
)
print(f"Use DL: {result} ({reason})")

Interview Answer

"Deep learning is warranted when: the input is unstructured (images, text, audio — no simpler model handles raw pixels or tokens well); the dataset is large (tens of thousands or more, unless transfer learning is available); or state-of-the-art performance is required. For tabular data with moderate-sized datasets, gradient boosted trees (XGBoost) often outperform neural networks with less compute and better interpretability. The deep learning tax — more data, compute, tuning time, and harder debugging — must be justified by measurable performance gains. My default for a new problem: start with XGBoost as the baseline, then evaluate whether the performance gap justifies switching to deep learning."
