AI for Developers · Lesson 3 of 6

# Prompt Engineering Mastery

## What Is Prompt Engineering?

A prompt is the input you give to an LLM. Prompt engineering is the practice of designing inputs that reliably produce the outputs you want.

The same model can produce wildly different results from different prompts. Prompt engineering isn't a workaround — it's a core skill for anyone building with AI.

❌ Bad prompt:    "Summarise this."
✅ Good prompt:   "Summarise the following customer support ticket in 2 sentences.
                   Focus on the issue and resolution. Tone: professional.
                   Output only the summary — no preamble."

The difference in output quality is dramatic.


## The Anatomy of a Prompt

Every LLM call has three message roles:

System message  → sets the model's persona, rules, and constraints
                  (persists for the whole conversation)

User message    → the actual request from the user

Assistant message → previous model responses (for multi-turn conversations)
````csharp
var messages = new List<ChatMessage>
{
    // System: what the model IS
    ChatMessage.CreateSystemMessage("""
        You are a senior .NET developer reviewing pull requests.
        Be specific, concise, and focus on correctness over style.
        Always include a code example when suggesting a change.
        Never praise — only flag issues or confirm correctness.
        """),

    // User: what the user WANTS
    ChatMessage.CreateUserMessage($"""
        Review this C# method:

        ```csharp
        {code}
        ```
        """),
};
````


**System prompts are your most powerful lever.** Spend more time on the system prompt than the user message — it defines how the model behaves for every request.

---

## Level 1: Zero-Shot Prompting

Ask directly — no examples needed. Works for simple, well-defined tasks:

Classify the sentiment of this review as Positive, Neutral, or Negative.

Review: "The delivery was late but the product itself is great."

Answer with one word only.


Output: `Positive`

Tips for zero-shot:
- Be explicit about the output format
- Specify length: "in one sentence", "in under 50 words", "as a JSON object"
- State what NOT to include: "no preamble", "no explanation", "output only the code"
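
Wired into the OpenAI .NET SDK used throughout this lesson, a zero-shot call is just a single direct user message. A minimal sketch; `chatClient` is assumed to be an existing `ChatClient` instance:

```csharp
// Zero-shot: one direct user message, no examples.
// Temperature 0 keeps classification output stable across runs.
var options = new ChatCompletionOptions { Temperature = 0f };

var messages = new List<ChatMessage>
{
    ChatMessage.CreateUserMessage("""
        Classify the sentiment of this review as Positive, Neutral, or Negative.

        Review: "The delivery was late but the product itself is great."

        Answer with one word only.
        """),
};

var response = await chatClient.CompleteChatAsync(messages, options);
var sentiment = response.Value.Content[0].Text.Trim();
```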

---

## Level 2: Few-Shot Prompting

Give 2–5 examples of input → output pairs. The model learns the pattern:

Classify support tickets by urgency: Critical, High, Medium, Low.

Examples:

Ticket: "Production server is down, no one can log in."
Urgency: Critical

Ticket: "Dashboard charts are loading slowly."
Urgency: Medium

Ticket: "Can you add dark mode?"
Urgency: Low

Now classify:

Ticket: "Payment processing is failing for 30% of users."
Urgency:


Output: `Critical`

Few-shot is powerful because you're teaching the *pattern* rather than explaining the rules. It works especially well for:
- Classification and labelling
- Format transformation (markdown → JSON, etc.)
- Tone matching
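
In application code it often pays to assemble few-shot prompts programmatically, so the examples can live in data rather than hard-coded strings. A sketch; the helper name and tuple shape are illustrative:

```csharp
using System.Collections.Generic;
using System.Text;

// Builds a few-shot classification prompt from example (input, label) pairs.
static string BuildFewShotPrompt(
    string instruction,
    IReadOnlyList<(string Ticket, string Urgency)> examples,
    string newTicket)
{
    var sb = new StringBuilder();
    sb.AppendLine(instruction);
    sb.AppendLine();
    sb.AppendLine("Examples:");
    foreach (var (ticket, urgency) in examples)
    {
        sb.AppendLine($"Ticket: \"{ticket}\"");
        sb.AppendLine($"Urgency: {urgency}");
        sb.AppendLine();
    }
    sb.AppendLine("Now classify:");
    sb.AppendLine($"Ticket: \"{newTicket}\"");
    sb.Append("Urgency:");
    return sb.ToString();
}
```

Ending the prompt with a bare `Urgency:` nudges the model to complete the pattern with just the label.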

---

## Level 3: Chain-of-Thought (CoT)

For reasoning tasks, tell the model to think step by step before answering. This dramatically improves accuracy on logic, maths, and multi-step problems.

❌ Direct question (often wrong):

A train leaves London at 9:00 am travelling at 120 mph. Another leaves Birmingham (110 miles away) at 9:30 am at 150 mph. At what time do they meet?

✅ Chain-of-thought:

A train leaves London at 9:00 am travelling at 120 mph. Another leaves Birmingham (110 miles away) at 9:30 am at 150 mph. At what time do they meet?

Think step by step before giving the final answer.


You can also use **"Let's think about this carefully"** or structure it explicitly:

Answer the following question.

Question:

First, identify all relevant facts. Then, reason through each step. Finally, state the answer clearly.


For code generation, CoT looks like this:

Write a C# method that finds the two numbers in a list that sum to a target.

Before writing code:

  1. Describe the algorithm you'll use
  2. Identify the time and space complexity
  3. List edge cases to handle

Then write the implementation.
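
When the model reasons out loud, downstream code usually wants only the conclusion. One approach is to have the prompt request a line starting with a fixed marker and then parse it out. A sketch; the marker wording is an assumption and must match whatever your prompt asks for:

```csharp
using System;

// Returns the text after the last "Final answer:" marker, or the whole
// completion if the marker is absent.
static string ExtractFinalAnswer(string completion)
{
    const string marker = "Final answer:";
    var idx = completion.LastIndexOf(marker, StringComparison.OrdinalIgnoreCase);
    return idx >= 0 ? completion[(idx + marker.Length)..].Trim() : completion.Trim();
}
```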


---

## Level 4: Structured Output (JSON Mode)

For any application code, you want predictable, parseable output — not free text.

```csharp
// Force JSON output
var options = new ChatCompletionOptions
{
    ResponseFormat = ChatResponseFormat.CreateJsonObjectFormat(),
    Temperature    = 0.1f,
};

var messages = new List<ChatMessage>
{
    ChatMessage.CreateSystemMessage("""
        You are an API that extracts structured data.
        Always respond with valid JSON only. No markdown, no explanation.
        """),
    ChatMessage.CreateUserMessage($"""
        Extract the following fields from this invoice text:
        - vendor_name (string)
        - invoice_date (ISO 8601 date)
        - total_amount (number)
        - line_items (array of {{ description, quantity, unit_price }})

        Invoice text:
        {invoiceText}
        """),
};

var response = await chatClient.CompleteChatAsync(messages, options, ct);
var json     = response.Value.Content[0].Text;
var invoice  = JsonSerializer.Deserialize<Invoice>(json); // Invoice: a DTO matching the fields above
```

JSON Schema enforcement (gpt-4o, Structured Outputs):

```csharp
// Strict structured output — model MUST match your schema
var options = new ChatCompletionOptions();
options.ResponseFormat = ChatResponseFormat.CreateJsonSchemaFormat(
    "invoice",
    BinaryData.FromString("""
    {
      "type": "object",
      "properties": {
        "vendor_name":    { "type": "string" },
        "invoice_date":   { "type": "string", "format": "date" },
        "total_amount":   { "type": "number" },
        "line_items": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "description": { "type": "string" },
              "quantity":    { "type": "integer" },
              "unit_price":  { "type": "number" }
            },
            "required": ["description", "quantity", "unit_price"],
            "additionalProperties": false
          }
        }
      },
      "required": ["vendor_name", "invoice_date", "total_amount", "line_items"],
      "additionalProperties": false
    }
    """),
    strictSchemaEnabled: true
);
```

Note: strict mode requires `"additionalProperties": false` on every object in the schema, including nested ones.
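
Because a strict schema guarantees the shape of the response, deserialising straight into typed records is safe. A sketch; the record names mirror the schema above, and `JsonNamingPolicy.SnakeCaseLower` requires .NET 8 or later:

```csharp
using System.Collections.Generic;
using System.Text.Json;

public record LineItem(string Description, int Quantity, decimal UnitPrice);

public record Invoice(
    string VendorName,
    string InvoiceDate,
    decimal TotalAmount,
    List<LineItem> LineItems);

// Parses the model's JSON output (guaranteed by the strict schema to match).
static Invoice ParseInvoice(string json) =>
    JsonSerializer.Deserialize<Invoice>(json, new JsonSerializerOptions
    {
        // Maps vendor_name → VendorName, line_items → LineItems, etc.
        PropertyNamingPolicy = JsonNamingPolicy.SnakeCaseLower,
    })!;
```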

## Level 5: The System Prompt as Your Product

Think of the system prompt as your product's brain. For any AI feature you ship, the system prompt is the most important piece of code you'll write.

Anatomy of a production system prompt:

[IDENTITY]
You are a code review assistant for Learnixo. You help developers improve 
their .NET and C# code. You are direct, technically precise, and never waste words.

[RULES]
- Only review C# / .NET code. Politely decline other requests.
- Flag bugs before style issues.
- Always show a corrected code snippet for every issue you raise.
- Never produce output longer than 500 words.
- If the code is correct, say so in one sentence and stop.

[OUTPUT FORMAT]
Respond in this exact structure:
## Issues Found
(list each issue with a before/after code example, or write "None" if no issues)

## Summary
(one sentence)

[TONE]
Senior developer peer review. No encouragement, no filler.

Rules for system prompts:

  1. Identity first — tell the model what it is before what to do
  2. Explicit constraints — what it should NOT do is as important as what it should
  3. Output format — define the exact structure you expect every time
  4. Tone — be specific: "professional but concise", "explain like I'm 5", "terse"
  5. Length control — always set a word/sentence limit for predictable output sizes
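
A system prompt this size is easier to maintain when it is composed from named sections, so each section can be versioned and reviewed like any other code change. A sketch; the helper and its parameters are illustrative:

```csharp
using System.Collections.Generic;
using System.Linq;

// Illustrative helper: each section lives in data (config, files, a database)
// rather than one monolithic string literal.
static string BuildSystemPrompt(
    string identity, IEnumerable<string> rules, string outputFormat, string tone) =>
    $"""
    [IDENTITY]
    {identity}

    [RULES]
    {string.Join("\n", rules.Select(r => "- " + r))}

    [OUTPUT FORMAT]
    {outputFormat}

    [TONE]
    {tone}
    """;
```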

## Level 6: Retrieval-Augmented Generation (RAG)

The model has a training cutoff and doesn't know your private data. RAG injects relevant context into the prompt at query time:

[SYSTEM]
You are a customer support agent for Learnixo. 
Answer questions using ONLY the context provided below.
If the answer is not in the context, say "I don't have that information."
Never make up information. Keep answers to 3 sentences max.

[CONTEXT]
{retrieved_documents}

[USER QUESTION]
{user_question}

The context comes from a vector database search — you embed the question, find the most similar document chunks, and inject them. This is covered in depth in the Production RAG Pipeline lesson.
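
Assembling the prompt itself is plain string work. A sketch using the SDK calls from earlier in this lesson; `vectorStore`, `SearchAsync`, and the chunk's `Text` property are hypothetical stand-ins for your retrieval layer:

```csharp
// Hypothetical retrieval step: embed the question, fetch the top matches.
var chunks  = await vectorStore.SearchAsync(userQuestion, topK: 3);
var context = string.Join("\n---\n", chunks.Select(c => c.Text));

var messages = new List<ChatMessage>
{
    ChatMessage.CreateSystemMessage("""
        You are a customer support agent for Learnixo.
        Answer questions using ONLY the context provided below.
        If the answer is not in the context, say "I don't have that information."
        Never make up information. Keep answers to 3 sentences max.
        """),
    ChatMessage.CreateUserMessage($"""
        [CONTEXT]
        {context}

        [USER QUESTION]
        {userQuestion}
        """),
};
```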


## Level 7: ReAct — Reason + Act

ReAct (Reasoning + Acting) is a pattern for multi-step tasks where the model needs to use tools:

You are a research assistant. You have access to these tools:
- search(query: string) → returns web search results
- calculate(expression: string) → evaluates a mathematical expression

For complex questions, use this format:
Thought: [your reasoning about what to do next]
Action: [tool name and input]
Observation: [tool result]
... (repeat Thought/Action/Observation as needed)
Final Answer: [your conclusion]

Question: What is 15% of the GDP of the UK in 2024?

Model output:

Thought: I need to find the UK GDP in 2024 first.
Action: search("UK GDP 2024 total")
Observation: UK GDP in 2024 was approximately £2.5 trillion ($3.1 trillion)
Thought: Now I can calculate 15% of £2.5 trillion.
Action: calculate("2500000000000 * 0.15")
Observation: 375000000000
Final Answer: 15% of the UK's GDP in 2024 is approximately £375 billion.

This is the foundation of AI agents. In practice, OpenAI's function/tool calling API automates this loop — covered in the AI Agents lesson.
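
Before tool-calling APIs, developers drove this loop by parsing the model's `Action:` lines themselves. A minimal parser sketch for the format shown above; the regex and method name are illustrative:

```csharp
using System.Text.RegularExpressions;

// Parses a line like: Action: search("UK GDP 2024 total")
// Returns (tool, argument), or null if the line is not an action.
static (string Tool, string Arg)? ParseAction(string line)
{
    var match = Regex.Match(line, """^Action:\s*(\w+)\("(.*)"\)\s*$""");
    return match.Success
        ? (match.Groups[1].Value, match.Groups[2].Value)
        : null;
}
```

Hand-rolled parsing like this is fragile, which is exactly why the function/tool calling API is preferred in production.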


## Practical Patterns by Use Case

### Code generation

You are an expert C# developer. Write production-quality code only.
Include XML doc comments on public members.
Use async/await throughout. Throw descriptive exceptions.
No placeholder comments like "// TODO" or "// implement this".

Task: {task}

### Summarisation

Summarise the following article for a technical audience.

Rules:
- Maximum 5 bullet points
- Each bullet: one sentence
- Focus on actionable insights, not background
- Use plain language — no jargon unless necessary

Article:
{article}

### Data extraction

Extract all dates mentioned in the following text.
Return a JSON array of objects: [{ "date": "YYYY-MM-DD", "context": "what the date refers to" }]
If no dates found, return [].
Output JSON only — no markdown, no explanation.

Text: {text}

### Classification

Classify the following support message into exactly one category:
billing | technical | feature-request | abuse | other

Rules:
- Output only the category name — nothing else
- If unsure between two, pick the more severe one
- "abuse" takes priority over all others

Message: {message}
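
Dropped into application code, the classification template becomes a small wrapper like the `classifier` exercised in the testing section. A sketch; the class name is illustrative and `chatClient` is the same `ChatClient` as in earlier snippets:

```csharp
// Minimal classifier wrapper around the template above.
public class TicketClassifier(ChatClient chatClient)
{
    public async Task<string> ClassifyAsync(string message)
    {
        var messages = new List<ChatMessage>
        {
            ChatMessage.CreateUserMessage($"""
                Classify the following support message into exactly one category:
                billing | technical | feature-request | abuse | other

                Rules:
                - Output only the category name — nothing else
                - If unsure between two, pick the more severe one
                - "abuse" takes priority over all others

                Message: {message}
                """),
        };

        // Temperature 0 keeps labels deterministic across runs.
        var options = new ChatCompletionOptions { Temperature = 0f };
        var response = await chatClient.CompleteChatAsync(messages, options);
        return response.Value.Content[0].Text.Trim();
    }
}
```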

## Prompt Anti-Patterns to Avoid

❌ Vague instructions
"Make it better" — better how? shorter? more formal? more accurate?

✅ "Rewrite this paragraph to be 30% shorter without losing any key information."

---

❌ Contradictory rules
"Be concise but comprehensive."

✅ "Cover these 3 points only: [list]. One sentence per point."

---

❌ No output format
"Extract the important info."

✅ "Extract: name, email, company. Return JSON: { name, email, company }.
    If a field is missing, use null."

---

❌ Relying on the model to "just know"
"Write code in our style."

✅ Provide 2–3 examples of your code style in the system prompt.

---

❌ No length constraint
"Write a blog post about X."

✅ "Write a 400-word blog post about X. No introduction paragraph — start with the main point."

## Testing Your Prompts

Treat prompts like code — test them systematically:

```csharp
// Run your prompt against a set of test cases
var testCases = new[]
{
    new { Input = "Production server down", Expected = "Critical" },
    new { Input = "Dashboard is slow",      Expected = "Medium"   },
    new { Input = "Add dark mode",          Expected = "Low"      },
};

var passed = 0;
foreach (var test in testCases)
{
    var result = await classifier.ClassifyAsync(test.Input);
    if (result == test.Expected) passed++;
    else Console.WriteLine($"FAIL: '{test.Input}' → '{result}' (expected '{test.Expected}')");
}

Console.WriteLine($"{passed}/{testCases.Length} tests passed.");
```

Use evals (evaluation sets) — a collection of input/expected output pairs — and run them on every prompt change. This is how production teams catch regressions when switching models or updating system prompts.


## Key Takeaways

  • System prompts define the model's behaviour — invest here first
  • Zero-shot works for simple tasks; few-shot teaches patterns from examples
  • Chain-of-thought ("think step by step") dramatically improves reasoning quality
  • Structured output (JSON mode / JSON Schema) makes AI responses parseable and reliable
  • Output format constraints (length, structure, tone) are not optional — always specify them
  • ReAct is the foundation of AI agents — reason, use a tool, observe, repeat
  • Test your prompts with eval sets — they drift when you change models or system prompts
  • The best prompt is the most specific one: vague in → vague out, precise in → precise out
## Lesson Checkpoint

Quick Check (Question 1 of 4)

What does chain-of-thought prompting add to get better reasoning?