AI Systems · Intermediate

AI Agent Memory — Giving Agents Context Over Time

Implement memory for AI agents in .NET: in-session chat history, short-term conversation context, long-term memory with vector search, semantic memory with Semantic Kernel, and memory safety for clinical systems.

Asma Hafeez Khan · May 16, 2026 · 7 min read
Tags: AI Agents, Memory, Semantic Kernel, .NET, Vector Search

Types of Agent Memory

Stateless agent (no memory):
  Turn 1: "What is the INR for MRN-001?" → 2.4
  Turn 2: "Is that in range?"             → "What are you referring to?"
  The agent has no context from turn 1.

In-session memory (chat history):
  Turn 1: "What is the INR for MRN-001?" → 2.4
  Turn 2: "Is that in range?"             → "Yes — 2.4 is within therapeutic range (2.0–3.0)"
  The agent remembers the conversation so far.

Long-term memory (vector store):
  Session 1: "Patient MRN-001 has a penicillin allergy."
  [session ends — no in-memory state]
  Session 2: "Can we prescribe Amoxicillin for MRN-001?"
  Agent retrieves: "Patient MRN-001: documented penicillin allergy (2024-11-03)"
  → "Amoxicillin is a penicillin-class antibiotic. Patient MRN-001 has a documented
     penicillin allergy and should not be prescribed Amoxicillin."

Four memory types:
  → In-session (chat history):       within one conversation
  → Short-term (ephemeral store):    within a user session, cleared on logout
  → Long-term (vector/SQL store):    persists across sessions
  → Semantic (retrieved by meaning): long-term memories retrieved by relevance

In-Session Memory: Chat History

C#
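// Chat history is the simplest form of agent memory
// It is passed with every request, so the model sees the full conversation
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;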

public sealed class ClinicalCopilotSession
{
    private readonly ChatHistory           _history;
    private readonly IChatCompletionService _chat;
    private readonly Kernel                _kernel;

    public ClinicalCopilotSession(IChatCompletionService chat, Kernel kernel)
    {
        _chat    = chat;
        _kernel  = kernel;
        _history = new ChatHistory("""
            You are a pharmacist assistant for a clinical prescription system.
            You remember previous questions in this conversation and build on them.
            You only answer questions about prescriptions and medication data.
            """);
    }

    public async Task<string> AskAsync(string question, CancellationToken ct)
    {
        _history.AddUserMessage(question);

        var settings = new OpenAIPromptExecutionSettings
        {
            ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions,
            Temperature      = 0.2
        };

        var response = await _chat.GetChatMessageContentAsync(
            _history, settings, _kernel, ct);

        // Store assistant response in history — maintains conversation context
        _history.AddAssistantMessage(response.Content ?? string.Empty);

        return response.Content ?? string.Empty;
    }

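    // Prevent unbounded growth: keep the system prompt plus the most recent messages.
    // With auto-invoked tools the history may also hold tool-call and tool-result
    // messages, so maxTurns * 2 only approximates "user + assistant" turns.
    public void TrimHistory(int maxTurns = 20)
    {
        var systemMessage = _history.FirstOrDefault(m => m.Role == AuthorRole.System);
        var recentTurns   = _history
            .Where(m => m.Role != AuthorRole.System)
            .TakeLast(maxTurns * 2)
            // Start at a user message so a tool result is never kept without its call
            .SkipWhile(m => m.Role != AuthorRole.User)
            .ToList();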

        _history.Clear();
        if (systemMessage is not null)
            _history.Add(systemMessage);
        foreach (var message in recentTurns)
            _history.Add(message);
    }
}

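// Registration. Caution: in an ASP.NET Core web API a scoped service lives for one
// HTTP request (in Blazor Server, for one circuit), not for a whole user session.
// For history that must survive across requests, persist it externally
// (see the Redis session store in the next section).
builder.Services.AddScoped<ClinicalCopilotSession>();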

Short-Term Memory: Redis Session Store

C#
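// For multi-server deployments, store chat history in Redis
// so any server can resume the user's session
using System.Text.Json;
using Microsoft.Extensions.Caching.Distributed;
using Microsoft.SemanticKernel.ChatCompletion;

public sealed class RedisChatHistoryStore
{
    private readonly IDistributedCache _cache;
    private readonly JsonSerializerOptions _json = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase
    };

    // IDistributedCache is backed by Redis when AddStackExchangeRedisCache is registered
    public RedisChatHistoryStore(IDistributedCache cache) => _cache = cache;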

    public async Task<ChatHistory> LoadHistoryAsync(
        string sessionId, string systemPrompt, CancellationToken ct)
    {
        var key  = $"copilot:history:{sessionId}";
        var data = await _cache.GetStringAsync(key, ct);

        if (data is null)
            return new ChatHistory(systemPrompt);

        var messages = JsonSerializer.Deserialize<List<StoredMessage>>(data, _json)
            ?? new List<StoredMessage>();

        var history = new ChatHistory(systemPrompt);
        foreach (var msg in messages)
        {
            if (msg.Role == "user")      history.AddUserMessage(msg.Content);
            if (msg.Role == "assistant") history.AddAssistantMessage(msg.Content);
        }

        return history;
    }

    public async Task SaveHistoryAsync(
        string sessionId, ChatHistory history, CancellationToken ct)
    {
        var key = $"copilot:history:{sessionId}";
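        var messages = history
            // Persist only user/assistant text; the system prompt is re-supplied on load,
            // and tool-call messages cannot be rebuilt from plain strings
            .Where(m => (m.Role == AuthorRole.User || m.Role == AuthorRole.Assistant)
                        && !string.IsNullOrEmpty(m.Content))
            .Select(m => new StoredMessage(
                m.Role.ToString().ToLowerInvariant(),
                m.Content ?? string.Empty))
            .ToList();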

        var data = JsonSerializer.Serialize(messages, _json);
        await _cache.SetStringAsync(key, data,
            new DistributedCacheEntryOptions
            {
                SlidingExpiration = TimeSpan.FromHours(4)  // clear after 4h inactivity
            }, ct);
    }

    private sealed record StoredMessage(string Role, string Content);
}
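
To make the stored format concrete, here is a minimal sketch of the serialize/deserialize round trip that RedisChatHistoryStore performs, with Redis and Semantic Kernel left out so it runs on its own. The StoredMessage record stands in for the private record above, and the sample messages are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Same serializer settings as RedisChatHistoryStore above.
var json = new JsonSerializerOptions
{
    PropertyNamingPolicy = JsonNamingPolicy.CamelCase
};

// Illustrative conversation; the system prompt is deliberately not stored.
var messages = new List<StoredMessage>
{
    new("user", "What is the INR for MRN-001?"),
    new("assistant", "The latest INR is 2.4."),
};

// This string is what would be written under copilot:history:{sessionId}.
string payload = JsonSerializer.Serialize(messages, json);
Console.WriteLine(payload);

// Reading it back restores role and content for each message.
var restored = JsonSerializer.Deserialize<List<StoredMessage>>(payload, json)
    ?? new List<StoredMessage>();
foreach (var msg in restored)
    Console.WriteLine($"{msg.Role}: {msg.Content}");

// Stand-in for the private record in RedisChatHistoryStore.
public sealed record StoredMessage(string Role, string Content);
```

Because the payload holds conversation text, it is PHI in a clinical deployment and deserves the same protection as the memory store itself.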

Long-Term Semantic Memory

C#
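// Semantic memory: store facts about patients/medications, retrieve by meaning
// Uses vector embeddings: "penicillin allergy" is retrieved when asking about "Amoxicillin"
// NuGet: Microsoft.SemanticKernel.Memory, Microsoft.SemanticKernel.Plugins.Memory
// Note: these memory APIs are marked experimental in Semantic Kernel, so the compiler
// reports SKEXP diagnostics until they are suppressed (#pragma warning disable or NoWarn)

var mrn = "MRN-001"; // example patient identifier used in the calls below

// Setup: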
var memoryBuilder = new MemoryBuilder();
memoryBuilder
    .WithAzureOpenAITextEmbeddingGeneration(
        deploymentName: "text-embedding-ada-002",
        endpoint:       config["AzureOpenAI:Endpoint"]!,
        apiKey:         config["AzureOpenAI:ApiKey"]!)
    .WithMemoryStore(
        new VolatileMemoryStore());  // replace with SqliteMemoryStore for persistence

var memory = memoryBuilder.Build();

// Store clinical facts:
await memory.SaveInformationAsync(
    collection: $"patient:{mrn}",
    text:       $"Patient {mrn} has a documented penicillin allergy (confirmed 2024-11-03)",
    id:         "allergy-penicillin");

await memory.SaveInformationAsync(
    collection: $"patient:{mrn}",
    text:       $"Patient {mrn} is on long-term warfarin therapy for atrial fibrillation. " +
                $"Target INR range: 2.0–3.0. Regular INR monitoring required.",
    id:         "warfarin-indication");

// Retrieve by semantic similarity:
var results = memory.SearchAsync(
    collection: $"patient:{mrn}",
    query:      "can we prescribe amoxicillin?",
    limit:      3,
    minRelevanceScore: 0.75);

await foreach (var result in results)
{
    Console.WriteLine($"Relevance: {result.Relevance:F2} — {result.Metadata.Text}");
    // Relevance: 0.89 — Patient MRN-001 has a documented penicillin allergy...
}
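
The limit and minRelevanceScore parameters are easier to reason about with the ranking step written out. The sketch below is not Semantic Kernel's implementation. It assumes relevance is the cosine similarity between embedding vectors, and it uses hand-made three-dimensional vectors where a real system would use the embedding model's output. MemoryEntry and VectorSearch are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy "embeddings": hand-made 3-dimensional vectors, NOT real model output.
var stored = new List<MemoryEntry>
{
    new("Patient has a documented penicillin allergy", new[] { 0.9, 0.1, 0.0 }),
    new("Patient is on long-term warfarin therapy",    new[] { 0.1, 0.9, 0.1 }),
    new("Patient prefers morning appointments",        new[] { 0.0, 0.1, 0.9 }),
};

// Pretend this is the embedding of "can we prescribe amoxicillin?".
double[] query = { 0.8, 0.2, 0.0 };

var hits = VectorSearch.Search(stored, query, limit: 3, minRelevanceScore: 0.75).ToList();
foreach (var hit in hits)
    Console.WriteLine($"Relevance: {hit.Relevance:F2} - {hit.Text}");

public sealed record MemoryEntry(string Text, double[] Embedding);

public static class VectorSearch
{
    // Cosine similarity: dot(a, b) / (|a| * |b|); 1.0 means same direction.
    public static double Cosine(double[] a, double[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot   += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Score every entry, drop those below the threshold, keep the top `limit`.
    public static IEnumerable<(string Text, double Relevance)> Search(
        IEnumerable<MemoryEntry> entries, double[] query, int limit, double minRelevanceScore) =>
        entries
            .Select(e => (e.Text, Relevance: Cosine(e.Embedding, query)))
            .Where(x => x.Relevance >= minRelevanceScore)
            .OrderByDescending(x => x.Relevance)
            .Take(limit);
}
```

With these toy vectors only the allergy entry clears the 0.75 threshold, even though limit allows three results. Raising the threshold trades recall for precision, which in a clinical setting means a missed allergy is the failure mode to test for.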

Injecting Long-Term Memory into the Agent

C#
// Retrieve relevant memories and inject them into the system prompt
// before each agent invocation

public sealed class MemoryAwareCopilotService
{
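    private readonly IChatCompletionService _chat;
    private readonly Kernel                 _kernel;
    private readonly ISemanticTextMemory    _memory;

    public MemoryAwareCopilotService(
        IChatCompletionService chat, Kernel kernel, ISemanticTextMemory memory)
    {
        _chat   = chat;
        _kernel = kernel;
        _memory = memory;
    }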

    public async Task<string> AskWithMemoryAsync(
        string mrn,
        string question,
        ChatHistory conversationHistory,
        CancellationToken ct)
    {
        // Retrieve patient-specific memories relevant to this question
        var memories = new List<string>();
        await foreach (var result in _memory.SearchAsync(
            collection:        $"patient:{mrn}",
            query:             question,
            limit:             5,
            minRelevanceScore: 0.70,
            cancellationToken: ct))
        {
            memories.Add(result.Metadata.Text);
        }

        // Build context-enriched system message
        var contextSection = memories.Any()
            ? $"\n\nKnown facts about patient {mrn}:\n" +
              string.Join("\n", memories.Select(m => $"- {m}"))
            : string.Empty;

        var enrichedHistory = new ChatHistory(
            BaseSystemPrompt + contextSection);

        // Copy conversation history into enriched context
        foreach (var message in conversationHistory.Where(m => m.Role != AuthorRole.System))
            enrichedHistory.Add(message);

        enrichedHistory.AddUserMessage(question);

        var settings = new OpenAIPromptExecutionSettings
        {
            ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions,
            Temperature      = 0.2
        };

        var response = await _chat.GetChatMessageContentAsync(
            enrichedHistory, settings, _kernel, ct);

        return response.Content ?? string.Empty;
    }

    private const string BaseSystemPrompt = """
        You are a pharmacist assistant for a clinical prescription system.
        Use the known facts section to inform your answers — treat them as verified clinical records.
        Always cite which fact you are using when it influences your response.
        """;
}

Memory Safety for Clinical Systems

Clinical memory requires strict safety guardrails:

1. Memory isolation per patient
   NEVER mix memory collections across patients.
   Collection name MUST include the patient identifier: "patient:{mrn}"
   Never query a cross-patient collection with patient-specific questions.

2. Memory provenance
   Every stored memory must record: source, who recorded it, timestamp
   "Patient MRN-001 has penicillin allergy" without provenance is
   unverifiable — it could have been entered in error.

3. Memory expiry
   Clinical facts can become outdated: INR range changes, allergy disproven.
   Set TTL on memories or flag them for clinical review at renewal intervals.

4. Memory vs. real-time data
   Memory is for context and patient history facts.
   NEVER use memory for current clinical values (INR readings, active prescriptions).
   Always fetch live data via tools for anything that could have changed.

5. Memory confidentiality
   Patient memories contain PHI — encrypt at rest, enforce access control.
   Do not log memory contents in plain text.
   Audit all memory reads for the patient's data access record.
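
Rules 1 and 5 can be enforced in code instead of by convention. The following is a minimal sketch and not part of Semantic Kernel: PatientMemoryScope, the assumed MRN format, and the hashed audit line are hypothetical choices made for illustration.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Text.RegularExpressions;

string collection = PatientMemoryScope.CollectionFor("MRN-001");
Console.WriteLine(collection);   // patient:MRN-001

string audit = PatientMemoryScope.AuditLine(
    userId: "pharmacist-17",
    mrn: "MRN-001",
    memoryText: "Patient MRN-001 has a documented penicillin allergy");
Console.WriteLine(audit);

public static class PatientMemoryScope
{
    // Assumed MRN format ("MRN-" plus digits); adjust to the real identifier scheme.
    private static readonly Regex MrnPattern = new(@"^MRN-[0-9]+\z", RegexOptions.Compiled);

    // Single place where collection names are built, so a blank or malformed
    // MRN can never produce a shared or cross-patient collection.
    public static string CollectionFor(string mrn)
    {
        if (string.IsNullOrWhiteSpace(mrn) || !MrnPattern.IsMatch(mrn))
            throw new ArgumentException("A valid MRN is required for patient memory.", nameof(mrn));
        return $"patient:{mrn}";
    }

    // Audit entry recording who read which patient's memory and when, with a
    // SHA-256 fingerprint of the text in place of the memory contents.
    public static string AuditLine(string userId, string mrn, string memoryText)
    {
        byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(memoryText));
        string fingerprint = Convert.ToHexString(hash)[..12];
        return $"{DateTimeOffset.UtcNow:O} user={userId} collection={CollectionFor(mrn)} memory-sha256={fingerprint}";
    }
}
```

Routing every SaveInformationAsync and SearchAsync call through CollectionFor keeps the "patient:{mrn}" convention in one place. The audit line still contains the patient identifier, so the audit log itself needs access control.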

Safe pattern:
  → Tools: current clinical data (always fresh from database)
  → Memory: stable patient facts (allergies, long-term indications, care preferences)
  → Chat history: current conversation context
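
One way to keep this split honest is to classify every fact before it is written to memory and reject anything volatile. The sketch below is an illustration under assumptions: FactKind, its members, and MemoryWriteGuard are hypothetical, and the classification itself would come from the calling code or a clinician, not from this guard.

```csharp
using System;

Console.WriteLine(MemoryWriteGuard.CanStore(FactKind.Allergy));     // True
Console.WriteLine(MemoryWriteGuard.CanStore(FactKind.LabResult));   // False

public enum FactKind
{
    Allergy,              // stable: belongs in memory
    LongTermIndication,   // stable: belongs in memory
    CarePreference,       // stable, but needs periodic review
    LabResult,            // volatile (e.g. an INR reading): fetch through a tool
    ActivePrescription    // volatile: fetch through a tool
}

public static class MemoryWriteGuard
{
    // Only stable patient facts may be written to long-term memory.
    public static bool CanStore(FactKind kind) => kind switch
    {
        FactKind.Allergy            => true,
        FactKind.LongTermIndication => true,
        FactKind.CarePreference     => true,
        FactKind.LabResult          => false,
        FactKind.ActivePrescription => false,
        _ => throw new ArgumentOutOfRangeException(nameof(kind))
    };

    // Intended to be called before SaveInformationAsync; throws instead of
    // silently storing a value that will go stale.
    public static void EnsureStorable(FactKind kind, string text)
    {
        if (!CanStore(kind))
            throw new InvalidOperationException(
                $"{kind} values change over time and must be read live via a tool, not stored in memory.");
        if (string.IsNullOrWhiteSpace(text))
            throw new ArgumentException("Memory text must not be empty.", nameof(text));
    }
}
```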

Production issue I've seen: A clinical AI assistant was built with semantic memory to store clinical notes about patients. A pharmacist entered: "Patient MRN-042 refuses oral medication — prefers IV." This was stored in memory. Six months later, the patient's condition had changed and they could take oral medication — but the memory was never updated. The agent consistently recommended IV formulations for this patient, even when oral was appropriate and preferred. The prescribers didn't know why the AI kept making this recommendation. The fix: memories must have a recordedAt timestamp, a reviewBy date, and a confirmation step when used: "I have a note from 6 months ago that this patient prefers IV. Is this still accurate?"
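
The fix described above can be sketched as a small record plus a rendering step that runs before a memory is injected into the prompt. Everything here is hypothetical: the PatientMemory shape, the MemoryReview helper, the wording of the confirmation request, and the dates, which are made up to reproduce a note that is about six months old.

```csharp
using System;

var note = new PatientMemory(
    Text:       "Patient MRN-042 refuses oral medication and prefers IV.",
    Source:     "pharmacist note",
    RecordedBy: "pharmacist-17",
    RecordedAt: new DateTimeOffset(2025, 1, 10, 0, 0, 0, TimeSpan.Zero),
    ReviewBy:   new DateTimeOffset(2025, 4, 10, 0, 0, 0, TimeSpan.Zero));

// Illustrative "current time", roughly six months after the note was recorded.
var now = new DateTimeOffset(2025, 7, 10, 0, 0, 0, TimeSpan.Zero);
Console.WriteLine(MemoryReview.ToPromptLine(note, now));

public sealed record PatientMemory(
    string Text,
    string Source,
    string RecordedBy,
    DateTimeOffset RecordedAt,
    DateTimeOffset ReviewBy);

public static class MemoryReview
{
    // A memory is due for review once its reviewBy date has passed.
    public static bool NeedsReview(PatientMemory memory, DateTimeOffset now) =>
        now >= memory.ReviewBy;

    // Renders the memory for the prompt. Overdue memories are marked so the
    // agent asks for confirmation instead of acting on them.
    public static string ToPromptLine(PatientMemory memory, DateTimeOffset now)
    {
        int ageInDays = (int)(now - memory.RecordedAt).TotalDays;
        string provenance =
            $"(source: {memory.Source}, recorded by {memory.RecordedBy} on {memory.RecordedAt:yyyy-MM-dd})";

        return NeedsReview(memory, now)
            ? $"UNCONFIRMED, {ageInDays} days old: {memory.Text} {provenance} " +
              "Ask the user whether this is still accurate before relying on it."
            : $"{memory.Text} {provenance}";
    }
}
```

With Semantic Kernel's memory API, fields like these could travel in the additionalMetadata argument of SaveInformationAsync as serialized JSON. That is one option among several, and a SQL table alongside the vector store would work as well.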


Key Takeaway

Agent context comes from four layers: chat history (one conversation), a Redis session store (the same history shared across servers), long-term vector memory (semantic facts across sessions), and live tool data (always fresh). For clinical systems, use tools for current clinical values (INR, prescriptions) and memory only for stable patient facts (allergies, long-term indications). Always scope memory collections per patient, record provenance and timestamps, and surface memory to users before acting on it, because stale clinical memory is a patient safety risk. Memory enriches agent context; it never replaces real-time data from authoritative systems.
