Microsoft.Extensions.AI — The .NET AI Abstraction Layer

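Microsoft.Extensions.AI is the official .NET abstraction for AI services. It defines provider-neutral interfaces (IChatClient, IEmbeddingGenerator) that work with any provider that ships an implementation, including OpenAI, Azure OpenAI, Ollama, and Anthropic, so switching providers means changing one line of DI registration, not rewriting business logic. The samples below use the method names from the early 9.x preview packages (CompleteAsync, CompleteStreamingAsync); later releases renamed them to GetResponseAsync and GetStreamingResponseAsync, so match the names to the package version you install.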

Why This Matters

Without the abstraction:
  // Tied to OpenAI SDK directly
  var client = new OpenAIClient(apiKey);
  var chat   = client.GetChatClient("gpt-4o");
  var result = await chat.CompleteChatAsync(messages);
  // Switching to Azure OpenAI = rewrite all call sites

With Microsoft.Extensions.AI:
  // Business code depends on IChatClient (interface)
  var response = await chatClient.CompleteAsync(messages);
  // Switch providers by changing ONE line in Program.cs

Step 1: Install Packages

XML

<!-- Core abstraction -->
<PackageReference Include="Microsoft.Extensions.AI" Version="9.*" />

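<!-- Provider implementations (pick one or more) -->
<PackageReference Include="Microsoft.Extensions.AI.OpenAI"      Version="9.*" />
<PackageReference Include="Microsoft.Extensions.AI.AzureAIInference" Version="9.*" />
<PackageReference Include="Microsoft.Extensions.AI.Ollama"      Version="9.*" />

<!-- Option B in Step 2 (AzureOpenAIClient with DefaultAzureCredential) also needs the
     Azure.AI.OpenAI and Azure.Identity packages. It goes through the
     Microsoft.Extensions.AI.OpenAI adapter, not Microsoft.Extensions.AI.AzureAIInference. -->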

Step 2: Register IChatClient in DI

// Program.cs — choose your provider here, nowhere else

// Option A: OpenAI
builder.Services.AddChatClient(services =>
    new OpenAIClient(builder.Configuration["OpenAI:ApiKey"]!)
        .AsChatClient("gpt-4o"));

// Option B: Azure OpenAI
builder.Services.AddChatClient(services =>
    new AzureOpenAIClient(
        new Uri(builder.Configuration["AzureOpenAI:Endpoint"]!),
        new DefaultAzureCredential())
        .AsChatClient("gpt-4o"));

// Option C: Ollama (local, free)
builder.Services.AddChatClient(services =>
    new OllamaChatClient(new Uri("http://localhost:11434"), "llama3.2"));

// The rest of the application uses IChatClient — provider-agnostic
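
If the provider should be switchable per environment without a code change, the three options above can be folded into a single registration that reads the choice from configuration. This is a minimal sketch that reuses only the calls shown above; the configuration keys "AI:Provider" and "AI:Model" are hypothetical names introduced for illustration, and the usings assume the packages from Step 1 plus Azure.AI.OpenAI and Azure.Identity.

```csharp
// Program.cs (sketch). "AI:Provider" and "AI:Model" are hypothetical configuration keys.
using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Extensions.AI;
using OpenAI;

var builder = WebApplication.CreateBuilder(args);

// Fall back to the local Ollama setup when nothing is configured.
var provider = builder.Configuration["AI:Provider"] ?? "ollama";
var model    = builder.Configuration["AI:Model"] ?? "llama3.2";

builder.Services.AddChatClient(services => provider.ToLowerInvariant() switch
{
    "openai" => new OpenAIClient(builder.Configuration["OpenAI:ApiKey"]!)
        .AsChatClient(model),

    "azure" => new AzureOpenAIClient(
            new Uri(builder.Configuration["AzureOpenAI:Endpoint"]!),
            new DefaultAzureCredential())
        .AsChatClient(model),

    "ollama" => new OllamaChatClient(new Uri("http://localhost:11434"), model),

    // Fail fast on a typo instead of silently picking a default provider.
    _ => throw new InvalidOperationException($"Unknown AI provider '{provider}'."),
});

var app = builder.Build();
app.Run();
```

Services that inject IChatClient are unaffected by the value of the setting, which is the point of keeping the provider choice in Program.cs.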

Step 3: Use IChatClient in Services

// Inject and use — same code regardless of provider
public class OrderSummaryService(IChatClient chatClient)
{
    public async Task<string> SummariseOrderAsync(Order order, CancellationToken ct)
    {
        var messages = new List<ChatMessage>
        {
            new(ChatRole.System, "You are an order analyst. Be concise."),
            new(ChatRole.User,   $"Summarise this order in one sentence: {JsonSerializer.Serialize(order)}"),
        };

        var response = await chatClient.CompleteAsync(messages, cancellationToken: ct);
        return response.Message.Text ?? "";
    }

    // Streaming response
    public async IAsyncEnumerable<string> StreamSummaryAsync(
        Order order,
        [EnumeratorCancellation] CancellationToken ct)
    {
        var messages = new List<ChatMessage>
        {
            new(ChatRole.System, "You are an order analyst."),
            new(ChatRole.User,   $"Analyse this order: {JsonSerializer.Serialize(order)}"),
        };

        await foreach (var update in chatClient.CompleteStreamingAsync(messages, cancellationToken: ct))
        {
            if (update.Text is not null)
                yield return update.Text;
        }
    }
}

Step 4: Middleware Pipeline

IChatClient supports a middleware pipeline — add cross-cutting concerns without touching business code.

builder.Services.AddChatClient(services =>
    new OpenAIClient(apiKey).AsChatClient("gpt-4o"))
    // Middleware is applied in registration order (outermost first)
    .UseLogging()              // logs every request/response
    .UseOpenTelemetry()        // emits traces and metrics
    .UseFunctionInvocation()   // enables tool/function calling
    .UseDistributedCache();    // caches responses in the registered IDistributedCache (e.g. Redis)

// Custom middleware — e.g., inject tenant context into every request
public class TenantContextMiddleware(IChatClient inner, ITenantContext tenant)
    : DelegatingChatClient(inner)
{
    public override async Task<ChatCompletion> CompleteAsync(
        IList<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken ct = default)
    {
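        // Copy the list so the caller's conversation history is not mutated,
        // then prepend the tenant info as a system message.
        // Note: CompleteStreamingAsync needs the same override if streaming calls
        // should carry the tenant message too.
        var withTenant = new List<ChatMessage>(messages.Count + 1)
        {
            new(ChatRole.System,
                $"You are serving tenant: {tenant.TenantId}. Respond accordingly."),
        };
        withTenant.AddRange(messages);

        return await base.CompleteAsync(withTenant, options, ct);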
    }
}

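// Register custom middleware. The chat client is registered as a singleton by default,
// so ITenantContext must be resolvable from the root provider (not a scoped service).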
builder.Services.AddChatClient(services =>
    new OpenAIClient(apiKey).AsChatClient("gpt-4o"))
    .Use((inner, services) =>
        new TenantContextMiddleware(inner, services.GetRequiredService<ITenantContext>()));

Step 5: IEmbeddingGenerator

// Register embedding generator
builder.Services.AddEmbeddingGenerator<string, Embedding<float>>(services =>
    new OpenAIClient(apiKey).AsEmbeddingGenerator("text-embedding-3-small"));

// Use in a service
public class SemanticSearchService(IEmbeddingGenerator<string, Embedding<float>> embedder)
{
    public async Task<float[]> GetEmbeddingAsync(string text, CancellationToken ct)
    {
        var result = await embedder.GenerateAsync([text], cancellationToken: ct);
        return result[0].Vector.ToArray();
    }

    public async Task<List<float[]>> GetBatchEmbeddingsAsync(
        IEnumerable<string> texts, CancellationToken ct)
    {
        var results = await embedder.GenerateAsync(texts.ToList(), cancellationToken: ct);
        return results.Select(e => e.Vector.ToArray()).ToList();
    }
}
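
The vectors returned by the service above become useful once they are compared. The sketch below shows one common way to do that, cosine similarity over plain float arrays, using only the base class library. The VectorMath class, the tiny three-dimensional vectors, and the sample texts are made up for illustration; real embeddings have hundreds or thousands of dimensions, and a vector database would normally do this ranking for you.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory index: document text paired with its embedding vector.
var index = new List<(string Text, float[] Vector)>
{
    ("red running shoes",   new[] { 0.9f, 0.1f, 0.0f }),
    ("blue denim jacket",   new[] { 0.1f, 0.9f, 0.2f }),
    ("trail running socks", new[] { 0.8f, 0.2f, 0.1f }),
};

// In a real application this would come from GetEmbeddingAsync("running gear").
float[] query = { 1.0f, 0.0f, 0.0f };

// Print the two closest documents, best match first.
foreach (var (text, score) in VectorMath.Rank(query, index).Take(2))
    Console.WriteLine($"{score:F3}  {text}");

public static class VectorMath
{
    // Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 when either vector is all zeros.
    public static float CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        float dot = 0f, normA = 0f, normB = 0f;
        for (var i = 0; i < a.Length; i++)
        {
            dot   += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        if (normA == 0f || normB == 0f)
            return 0f;

        return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
    }

    // Scores every indexed item against the query and orders by descending similarity.
    public static IEnumerable<(string Text, float Score)> Rank(
        float[] query, IEnumerable<(string Text, float[] Vector)> index)
        => index
            .Select(item => (item.Text, Score: CosineSimilarity(query, item.Vector)))
            .OrderByDescending(result => result.Score);
}
```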

Step 6: Structured Output (JSON mode)

// Get strongly-typed responses instead of raw strings
public record OrderClassification(
    string Category,        // "Electronics", "Clothing", "Food"
    string Urgency,         // "Normal", "Urgent", "Critical"
    double ConfidenceScore);

public class OrderClassifier(IChatClient chatClient)
{
    public async Task<OrderClassification> ClassifyAsync(Order order, CancellationToken ct)
    {
        var messages = new List<ChatMessage>
        {
            new(ChatRole.System, """
                Classify the order and respond with JSON only.
                Schema: { "category": string, "urgency": string, "confidenceScore": number }
                """),
            new(ChatRole.User, JsonSerializer.Serialize(order)),
        };

        var options = new ChatOptions
        {
            ResponseFormat = ChatResponseFormat.Json,
        };

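        var response = await chatClient.CompleteAsync(messages, options, ct);

        // The prompt asks for camelCase keys. JsonSerializerOptions.Web (.NET 9) matches
        // property names case-insensitively, so they bind to the PascalCase record;
        // the default options would leave every property unset.
        return JsonSerializer.Deserialize<OrderClassification>(
                   response.Message.Text!, JsonSerializerOptions.Web)
               ?? throw new InvalidOperationException("The model returned no classification.");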
    }
}
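
JSON mode only asks the model for syntactically valid JSON; it does not guarantee that the values fall inside the sets listed in the record's comments. A defensive parsing step can catch that before the result reaches business logic. The sketch below uses only System.Text.Json; ClassificationParser and its TryParse method are hypothetical helpers, and the allowed values are taken from the comments on OrderClassification above.

```csharp
using System;
using System.Linq;
using System.Text.Json;

// Sample payload standing in for response.Message.Text.
var raw = """{ "category": "Electronics", "urgency": "Urgent", "confidenceScore": 0.92 }""";

if (ClassificationParser.TryParse(raw, out var result, out var error))
    Console.WriteLine($"{result!.Category} / {result.Urgency} ({result.ConfidenceScore:F2})");
else
    Console.WriteLine($"Rejected: {error}");

public record OrderClassification(string Category, string Urgency, double ConfidenceScore);

public static class ClassificationParser
{
    // Web defaults: camelCase naming and case-insensitive property matching.
    private static readonly JsonSerializerOptions Options = new(JsonSerializerDefaults.Web);

    private static readonly string[] Categories = { "Electronics", "Clothing", "Food" };
    private static readonly string[] Urgencies  = { "Normal", "Urgent", "Critical" };

    public static bool TryParse(string json, out OrderClassification? result, out string? error)
    {
        result = null;
        error = null;

        try
        {
            result = JsonSerializer.Deserialize<OrderClassification>(json, Options);
        }
        catch (JsonException ex)
        {
            error = $"Invalid JSON: {ex.Message}";
            return false;
        }

        // Missing properties deserialize to null or 0, so they fail the checks below.
        if (result is null)
            error = "Response was JSON null.";
        else if (!Categories.Contains(result.Category))
            error = $"Unknown category '{result.Category}'.";
        else if (!Urgencies.Contains(result.Urgency))
            error = $"Unknown urgency '{result.Urgency}'.";
        else if (result.ConfidenceScore is < 0 or > 1)
            error = "Confidence score must be between 0 and 1.";

        if (error is not null)
        {
            result = null;
            return false;
        }

        return true;
    }
}
```

Returning a bool with an error message, rather than throwing, leaves the caller free to retry the request or fall back to a default classification.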

Step 7: Tool / Function Calling

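// Define tools as C# methods; the framework generates the schema from the signature.
// Every parameter is described to the model as a tool argument, so dependencies such as
// the repository are captured from the enclosing scope (here, a `repo` variable or field)
// instead of being declared as parameters.
[Description("Get the current status and location of an order")]
async Task<string> GetOrderStatus(
    [Description("The order ID to look up")] int orderId)
{
    var order = await repo.GetByIdAsync(orderId, CancellationToken.None);
    return order is null ? "Order not found" : $"Status: {order.Status}, Location: {order.Location}";
}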

// Register tools on ChatOptions
var options = new ChatOptions
{
    Tools = [AIFunctionFactory.Create(GetOrderStatus)],
};

// With UseFunctionInvocation() middleware, tools are auto-invoked
var response = await chatClient.CompleteAsync(
    [new ChatMessage(ChatRole.User, "Where is order 42?")],
    options, ct);
// The framework calls GetOrderStatus(42), then sends result back to the LLM

Caching AI Responses

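// UseDistributedCache() caches responses for identical requests. The cache key is derived
// from the messages and the ChatOptions passed with the call (including ChatOptions.ModelId
// when it is set), so changing either one produces a separate cache entry.
builder.Services.AddChatClient(services =>
    new OpenAIClient(apiKey).AsChatClient("gpt-4o"))
    .UseDistributedCache();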

// Redis as the backing store (set up with AddStackExchangeRedisCache)
builder.Services.AddStackExchangeRedisCache(opts =>
    opts.Configuration = builder.Configuration.GetConnectionString("Redis"));
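
To see why only identical requests hit the cache, it helps to picture the key as a hash over everything that defines the request. The sketch below illustrates that idea with the base class library only. It is not the library's actual key algorithm; CacheKeySketch and its inputs are hypothetical and exist purely for illustration.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

var first  = CacheKeySketch.For("gpt-4o", ("user", "Where is order 42?"));
var repeat = CacheKeySketch.For("gpt-4o", ("user", "Where is order 42?"));
var other  = CacheKeySketch.For("gpt-4o", ("user", "Where is order 43?"));

Console.WriteLine(first == repeat); // same request, same key: a cache hit
Console.WriteLine(first == other);  // one character differs: a different key, so a miss

public static class CacheKeySketch
{
    // Serializes the model id and the messages to JSON, then hashes the result.
    // Any change to the model, a role, or the text yields a different key.
    public static string For(string modelId, params (string Role, string Text)[] messages)
    {
        var payload = JsonSerializer.Serialize(new
        {
            modelId,
            messages = messages.Select(m => new { m.Role, m.Text }).ToArray(),
        });

        return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(payload)));
    }
}
```

One practical consequence: prompts that embed timestamps, request IDs, or other per-call values will rarely produce a cache hit, so keep volatile data out of the messages when caching matters.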

Testing with a Fake IChatClient

// Inject a fake for testing — no API calls, no cost
public class FakeChatClient : IChatClient
{
    private readonly string _response;
    public FakeChatClient(string response) => _response = response;

    public Task<ChatCompletion> CompleteAsync(
        IList<ChatMessage> messages, ChatOptions? options = null, CancellationToken ct = default)
        => Task.FromResult(new ChatCompletion(new ChatMessage(ChatRole.Assistant, _response)));

    public IAsyncEnumerable<StreamingChatCompletionUpdate> CompleteStreamingAsync(
        IList<ChatMessage> messages, ChatOptions? options = null, CancellationToken ct = default)
        => throw new NotImplementedException();

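    public ChatClientMetadata Metadata => new("fake", null, "fake-model");

    // GetService has changed shape between package versions; match the signature declared
    // by IChatClient in the version you reference.
    public TService? GetService<TService>(object? key = null) where TService : class => null;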
    public void Dispose() { }
}

// In test
var chatClient = new FakeChatClient("Order 42 is on its way — estimated delivery tomorrow.");
var service    = new OrderSummaryService(chatClient);
var summary    = await service.SummariseOrderAsync(testOrder, CancellationToken.None);
summary.Should().Contain("42");

Interview Answer

"Microsoft.Extensions.AI is the official .NET abstraction for AI — it defines IChatClient and IEmbeddingGenerator interfaces that any provider can implement. You register the provider once in DI (OpenAI, Azure OpenAI, Ollama) and all business code depends only on the interface. The middleware pipeline adds logging, OpenTelemetry, function calling, and response caching with single method calls — no manual wiring. For structured output, set ResponseFormat to Json and deserialise the response. For tool calling, use AIFunctionFactory.Create with C# methods decorated with [Description] attributes, and add UseFunctionInvocation() to auto-invoke tools during completions. For testing: inject a FakeChatClient — no API calls, deterministic, fast. This is the recommended approach for all new .NET AI applications as of .NET 9."
