What is Prompt Engineering? — Prompt Engineering Mastery | Learnixo

Definition

Prompt engineering is the practice of designing, structuring, and iterating on the text inputs given to a language model to reliably elicit desired outputs.

LLM = black box that maps text → text

Prompt engineering = crafting the input text to:
  - Get the correct answer or format
  - Get consistent, reproducible behaviour
  - Avoid harmful, incorrect, or off-topic outputs
  - Satisfy latency and cost constraints (fewer tokens)

It is not magic — it is software engineering applied to a probabilistic text interface.

The Mental Model

An LLM is a next-token predictor trained on human text. Prompts work by exploiting the patterns the model has learned:

"The capital of France is" → very high P("Paris") — factual completion

"Translate to Spanish: 'The patient takes Warfarin'"
  → high P("El paciente toma Warfarina") — follows a pattern the model has
     seen thousands of times in translation corpora

"You are a triage nurse. Given these vitals, classify urgency:"
  → activates role-relevant training examples — model behaves as if
     it is a nurse completing clinical documentation

Every prompt is a compressed program that exploits the model's learned distributions.

Why Prompt Engineering Exists

Fine-tuning changes model weights — expensive, requires data, risks forgetting. Prompting changes the input — free, fast, reversible. For many tasks, a well-crafted prompt can match or approach fine-tuned performance:

Task: classify clinical notes by ICD-10 code

Fine-tuning approach:
  Collect 10,000 labelled examples
  Fine-tune BERT for 3 hours on a GPU cluster
  Deploy as a separate model endpoint
  Cost: $$$ (data collection, compute, maintenance)

Prompting approach:
  Write a prompt with 3-5 examples and clear instructions
  Call GPT-4 or LLaMA via API
  Iterate in an afternoon
  Cost: $ (API calls per query)

When fine-tuning wins: very high volume, strict latency, highly specialised domain
When prompting wins: low volume, fast iteration, general-purpose model is good enough

Prompt Engineering Is Not Stable

Prompts are fragile — they can fail on:

Model updates: the same prompt on GPT-4 vs GPT-4o may produce different results
Temperature: at T=0.9, responses vary; at T=0, more consistent but less creative
Token budget: truncated prompts behave differently
Edge cases: unusual inputs the prompt doesn't explicitly handle
Adversarial inputs: users who try to override the prompt (injection)

Good prompt engineering includes evaluation: measure whether the prompt works across the full expected input distribution, not just the happy path.

Types of Prompting

Zero-shot:   just the instruction, no examples
  "Classify the sentiment: 'The drug had no side effects' → "

Few-shot:    instruction + 3-5 examples before the query
  "Classify sentiment (positive/negative/neutral):
   'Terrible pain' → negative
   'No issues'     → positive
   'The drug had no side effects' → "

Chain-of-thought: instruct the model to reason step by step before answering
  "Think step by step, then classify..."

System prompts: separate role/instruction layer in chat models
  system: "You are a clinical coding assistant..."
  user:   "Code this note: ..."

Interview Answer

"Prompt engineering is the practice of designing text inputs to reliably elicit desired outputs from LLMs. It works by exploiting the patterns the model learned during pretraining — a well-structured prompt activates relevant learned behaviour without changing model weights. It's an alternative to fine-tuning when iteration speed or cost matters. Good prompt engineering includes evaluation on a representative test set: a prompt that works on 3 examples may fail on 30. The challenge is that prompts are fragile across model versions, input distributions, and adversarial inputs."