System Design: Multi-Tenant SaaS in .NET — Isolation, Onboarding, Billing, and Per-Tenant Customisation
System: B2B SaaS platform serving 1,200 business tenants, 100–5,000 users per tenant, regulated industries (legal, finance, healthcare). Some tenants require GDPR data residency in specific regions; others require complete database isolation for compliance audits.
The core challenge: No single isolation model fits all tenants. A startup on a $49/month plan needs cost-efficient row-level isolation. An enterprise on a $20K/month contract needs a dedicated database and the ability to hand a DB connection string to their auditors. The platform must support both without requiring two separate codebases.
Isolation Strategy Decision Matrix
| Strategy | Data isolation | Cost | Compliance | Schema migration |
|---|---|---|---|---|
| Row-level (shared schema) | Low (filter bugs) | Lowest | Difficult | One migration |
| Schema-per-tenant | Medium | Low | Possible | Per-tenant migration |
| Database-per-tenant | High | High | Excellent | Per-tenant migration |
We implement all three and choose per tenant tier:
- Starter plan: row-level isolation (shared database, shared schema, tenant_id column everywhere)
- Professional plan: schema-per-tenant (same PostgreSQL instance, SET search_path)
- Enterprise plan: database-per-tenant (dedicated PostgreSQL instance or Azure Database)
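
The tier-to-strategy choice above can be written down as a single mapping, so the rest of the code asks one place which isolation model applies. This is a minimal sketch: `IsolationStrategy` and `IsolationPolicy` are hypothetical names introduced for illustration, not types from the platform.

```csharp
using System;

public enum TenantTier { Starter, Professional, Enterprise }

// Hypothetical enum naming the three strategies from the decision matrix.
public enum IsolationStrategy { RowLevel, SchemaPerTenant, DatabasePerTenant }

public static class IsolationPolicy
{
    // Maps a plan tier to the isolation strategy listed for that plan.
    public static IsolationStrategy For(TenantTier tier) => tier switch
    {
        TenantTier.Starter      => IsolationStrategy.RowLevel,
        TenantTier.Professional => IsolationStrategy.SchemaPerTenant,
        TenantTier.Enterprise   => IsolationStrategy.DatabasePerTenant,
        _ => throw new ArgumentOutOfRangeException(nameof(tier), tier, "Unknown tier")
    };
}
```

Keeping the mapping in one switch means a new tier fails loudly until someone decides which isolation model it gets.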
Core Data Model
public class Tenant
{
public Guid Id { get; private set; }
public string Slug { get; private set; } = ""; // URL-safe identifier: "acme-corp"
public string DisplayName { get; private set; } = "";
public TenantTier Tier { get; private set; }
public TenantStatus Status { get; private set; }
public string Region { get; private set; } = "eu-west"; // data residency
public string? ConnectionString { get; private set; } // null for shared DB tenants
public string? SchemaName { get; private set; } // null for row-level tenants
}
// Declared outside Tenant so other types can refer to TenantTier and TenantStatus directly
public enum TenantTier { Starter, Professional, Enterprise }
public enum TenantStatus { Provisioning, Active, Suspended, Cancelled }
public class TenantConfiguration
{
public Guid TenantId { get; private set; }
public string Key { get; private set; } = "";
public string Value { get; private set; } = "";
public string? Description { get; private set; }
// e.g. "max_users" = "50", "logo_url" = "...", "primary_colour" = "#3B82F6"
}
public class TenantFeatureFlag
{
public Guid TenantId { get; private set; }
public string FeatureKey { get; private set; } = "";
public bool Enabled { get; private set; }
public DateTime? EnabledUntil { get; private set; } // temporary flags for trials
}
Design Decision 1: Tenant Resolution and Context
Every request must resolve the tenant before any data access. The resolution strategy depends on how tenants are identified:
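
For the subdomain case, taking the first label of the host is simple but also accepts the bare domain and nested subdomains. A stricter parse might look like the sketch below. It assumes the platform's base domain is known from configuration; `TenantSlug.TryParse` and its `baseDomain` parameter are hypothetical names, not part of the original resolver.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TenantSlug
{
    // Lowercase letters, digits and inner hyphens, e.g. "acme-corp".
    private static readonly Regex SlugPattern =
        new(@"^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\z", RegexOptions.Compiled);

    // Returns true and the slug when host has the form "<slug>.<baseDomain>".
    // Rejects the bare domain, nested subdomains and reserved platform names.
    public static bool TryParse(string host, string baseDomain, out string slug)
    {
        slug = "";
        var suffix = "." + baseDomain;
        if (!host.EndsWith(suffix, StringComparison.OrdinalIgnoreCase)) return false;

        var candidate = host[..^suffix.Length].ToLowerInvariant();
        if (candidate is "www" or "api" or "app") return false; // platform routes
        if (!SlugPattern.IsMatch(candidate)) return false;      // also rejects "a.b" and ""

        slug = candidate;
        return true;
    }
}
```

Validating the slug before it reaches the cache key or the database also keeps unexpected characters out of the Redis key space.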
public interface ITenantResolver
{
Task<TenantContext?> ResolveAsync(HttpContext context);
}
// Subdomain: acme-corp.myapp.com
public class SubdomainTenantResolver : ITenantResolver
{
private readonly ITenantRepository _tenants;
private readonly IDatabase _redis;
public async Task<TenantContext?> ResolveAsync(HttpContext context)
{
var host = context.Request.Host.Host;
var slug = host.Split('.')[0];
if (slug is "www" or "api" or "app") return null; // platform routes
// Cache tenant lookups to avoid DB hit on every request
var cacheKey = $"tenant:slug:{slug}";
var cached = await _redis.StringGetAsync(cacheKey);
if (cached.HasValue)
return JsonSerializer.Deserialize<TenantContext>(cached!);
var tenant = await _tenants.GetBySlugAsync(slug);
if (tenant is null) return null;
var ctx = TenantContext.From(tenant);
await _redis.StringSetAsync(cacheKey, JsonSerializer.Serialize(ctx), TimeSpan.FromMinutes(5));
return ctx;
}
}
// Also support: JWT claim, API key header, custom header X-Tenant-Id
public class TenantResolutionMiddleware
{
private readonly RequestDelegate _next;
private readonly IEnumerable<ITenantResolver> _resolvers;
public async Task InvokeAsync(HttpContext context)
{
TenantContext? tenant = null;
foreach (var resolver in _resolvers)
{
tenant = await resolver.ResolveAsync(context);
if (tenant is not null) break;
}
if (tenant is null && RequiresTenant(context.Request.Path))
{
context.Response.StatusCode = 404;
return;
}
context.Items["TenantContext"] = tenant;
TenantContextAccessor.Current = tenant;
await _next(context);
}
}
// AsyncLocal so the tenant flows with async operations without passing it explicitly
public static class TenantContextAccessor
{
private static readonly AsyncLocal<TenantContext?> _current = new();
public static TenantContext? Current
{
get => _current.Value;
set => _current.Value = value;
}
}
Design Decision 2: Row-Level Isolation with EF Core
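For Starter tenants, every table has a tenant_id column. A global query filter on every entity implementing ITenanted keeps LINQ queries from returning cross-tenant data, even if a developer forgets to add the filter manually. The filter does not cover queries that call IgnoreQueryFilters() or SQL executed outside LINQ, so those paths still need review.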
public class TenantedDbContext : DbContext
{
private readonly ITenantContextAccessor _tenantAccessor;
public TenantedDbContext(
DbContextOptions options,
ITenantContextAccessor tenantAccessor) : base(options)
{
_tenantAccessor = tenantAccessor;
}
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
// Apply tenant filter to every entity implementing ITenanted
foreach (var entityType in modelBuilder.Model.GetEntityTypes())
{
if (typeof(ITenanted).IsAssignableFrom(entityType.ClrType))
{
var method = typeof(TenantedDbContext)
.GetMethod(nameof(ApplyTenantFilter), BindingFlags.NonPublic | BindingFlags.Static)!
.MakeGenericMethod(entityType.ClrType);
method.Invoke(null, [modelBuilder, this]);
}
}
base.OnModelCreating(modelBuilder);
}
private static void ApplyTenantFilter<T>(ModelBuilder builder, TenantedDbContext ctx)
where T : class, ITenanted
{
builder.Entity<T>().HasQueryFilter(e =>
e.TenantId == ctx._tenantAccessor.Current!.TenantId);
}
public override int SaveChanges()
{
StampTenantId();
return base.SaveChanges();
}
public override Task<int> SaveChangesAsync(CancellationToken ct = default)
{
StampTenantId();
return base.SaveChangesAsync(ct);
}
private void StampTenantId()
{
var tenantId = _tenantAccessor.Current?.TenantId
?? throw new InvalidOperationException("No tenant context for write operation");
foreach (var entry in ChangeTracker.Entries<ITenanted>()
.Where(e => e.State == EntityState.Added))
{
entry.Entity.TenantId = tenantId;
}
}
}
public interface ITenanted
{
Guid TenantId { get; set; }
}
Design Decision 3: Schema-Per-Tenant for Professional Tier
Professional tenants get their own PostgreSQL schema. A connection interceptor sets search_path each time a connection opens, routing all queries to the tenant's schema without changing the application code.
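
Because the schema name ends up inside a SET search_path statement, it is worth validating before it is stored or used. The helper below is a sketch under assumptions: `PgSchemaName` is a hypothetical type, and the naming rule ("tenant_" plus the slug with hyphens replaced by underscores) follows the onboarding step later in this lesson. The 63-byte limit is PostgreSQL's default maximum identifier length.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class PgSchemaName
{
    private const int MaxIdentifierBytes = 63; // PostgreSQL truncates longer identifiers

    private static readonly Regex Safe = new(@"^[a-z_][a-z0-9_]*\z", RegexOptions.Compiled);

    // Builds "tenant_<slug>" with hyphens replaced by underscores, then validates the result.
    public static string FromSlug(string slug)
    {
        var name = "tenant_" + slug.Replace('-', '_');
        if (!Safe.IsMatch(name))
            throw new ArgumentException($"Slug '{slug}' does not yield a safe schema name", nameof(slug));
        if (Encoding.UTF8.GetByteCount(name) > MaxIdentifierBytes)
            throw new ArgumentException("Schema name exceeds 63 bytes", nameof(slug));
        return name;
    }

    // Double-quotes an identifier, escaping embedded quotes, for use in SET search_path.
    public static string Quote(string identifier) =>
        "\"" + identifier.Replace("\"", "\"\"") + "\"";
}
```

Validation and quoting are complementary: the first rejects bad names at onboarding time, the second protects the statement if a bad name slips through.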
public class SchemaRoutingDbContextFactory
{
private readonly string _sharedConnectionString;
private readonly ITenantContextAccessor _tenantAccessor;
public AppDbContext Create()
{
var tenant = _tenantAccessor.Current
?? throw new InvalidOperationException("No tenant context");
var schemaName = tenant.SchemaName
?? throw new InvalidOperationException("Tenant has no schema assigned");
var options = new DbContextOptionsBuilder<AppDbContext>()
.UseNpgsql(
_sharedConnectionString,
npgsql => npgsql.CommandTimeout(30))
.AddInterceptors(new SchemaRoutingInterceptor(schemaName))
.Options;
return new AppDbContext(options);
}
}
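public class SchemaRoutingInterceptor : DbConnectionInterceptor
{
private readonly string _setSearchPath;
public SchemaRoutingInterceptor(string schema)
{
// Quote the schema as an identifier so an unexpected name cannot inject SQL
var quoted = "\"" + schema.Replace("\"", "\"\"") + "\"";
_setSearchPath = $"SET search_path TO {quoted}, public";
}
// Cover synchronous opens too; otherwise they would keep the default search_path
public override void ConnectionOpened(DbConnection connection, ConnectionEndEventData eventData)
{
using var cmd = connection.CreateCommand();
cmd.CommandText = _setSearchPath;
cmd.ExecuteNonQuery();
}
public override async Task ConnectionOpenedAsync(
DbConnection connection,
ConnectionEndEventData eventData,
CancellationToken ct)
{
await using var cmd = connection.CreateCommand();
cmd.CommandText = _setSearchPath;
await cmd.ExecuteNonQueryAsync(ct);
}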
}
Design Decision 4: Database-Per-Tenant for Enterprise Tier
Enterprise tenants have a fully separate PostgreSQL instance. The platform stores the connection string in Azure Key Vault per tenant and retrieves it at runtime.
public class EnterpriseTenantDbContextFactory
{
private readonly SecretClient _keyVault;
private readonly ConcurrentDictionary<Guid, string> _connectionCache = new();
public async Task<AppDbContext> CreateAsync(TenantContext tenant)
{
var connectionString = await GetConnectionStringAsync(tenant.TenantId);
var options = new DbContextOptionsBuilder<AppDbContext>()
.UseNpgsql(connectionString)
.Options;
return new AppDbContext(options);
}
private async Task<string> GetConnectionStringAsync(Guid tenantId)
{
// Cache in-process: connection strings don't change frequently
if (_connectionCache.TryGetValue(tenantId, out var cached))
return cached;
var secretName = $"tenant-{tenantId}-db-connection";
var response = await _keyVault.GetSecretAsync(secretName);
var connectionString = response.Value.Value;
_connectionCache[tenantId] = connectionString;
return connectionString;
}
}
Design Decision 5: Automated Tenant Onboarding
Onboarding a new tenant is a multi-step saga. Each step must be compensatable — a failed email provider setup should not leave a half-provisioned tenant in the database.
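
One way to keep every step compensatable is to register an undo action as each step succeeds and run them in reverse on failure. The sketch below shows that pattern in isolation; `CompensationStack` is a hypothetical helper, not the orchestrator's actual mechanism.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class CompensationStack
{
    private readonly Stack<(string Step, Func<Task> Undo)> _undo = new();

    // Call after a step succeeds to record how to reverse it.
    public void Register(string step, Func<Task> undo) => _undo.Push((step, undo));

    // Runs the registered undo actions in reverse order (last completed step first).
    // Best effort: a failing undo is recorded and the remaining ones still run.
    // Returns the names of the steps whose undo failed.
    public async Task<IReadOnlyList<string>> RollbackAsync()
    {
        var failed = new List<string>();
        while (_undo.Count > 0)
        {
            var (step, undo) = _undo.Pop();
            try { await undo(); }
            catch (Exception) { failed.Add(step); }
        }
        return failed;
    }
}
```

Only steps that actually completed have an undo registered, so a failure in step 2 never tries to cancel a subscription that was never created.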
public class TenantOnboardingOrchestrator
{
public async Task<OnboardingResult> OnboardAsync(
TenantRegistrationCommand command,
CancellationToken ct)
{
Guid? tenantId = null;
string? slug = null;
try
{
// Step 1: Allocate tenant record
slug = await GenerateUniqueSlugAsync(command.CompanyName, ct);
var tenant = Tenant.Create(slug, command.CompanyName, command.Tier, command.Region);
await _tenants.AddAsync(tenant, ct);
tenantId = tenant.Id;
// Step 2: Provision data isolation layer
switch (command.Tier)
{
case TenantTier.Professional:
var schemaName = $"tenant_{slug.Replace('-', '_')}";
await _schemaProvisioner.CreateSchemaAsync(schemaName, ct);
await _schemaProvisioner.RunMigrationsAsync(schemaName, ct);
await _tenants.AssignSchemaAsync(tenantId.Value, schemaName, ct);
break;
case TenantTier.Enterprise:
var connString = await _dbProvisioner.ProvisionDatabaseAsync(tenant, ct);
await _keyVault.SetSecretAsync($"tenant-{tenantId}-db-connection", connString);
await _tenants.AssignConnectionStringAsync(tenantId.Value, connString, ct);
break;
// Starter: no additional provisioning, row-level isolation via global filter
}
// Step 3: Create Stripe customer and subscription
var stripeCustomerId = await _billing.CreateCustomerAsync(
command.AdminEmail, command.CompanyName, ct);
var subscriptionId = await _billing.CreateSubscriptionAsync(
stripeCustomerId, command.Tier.ToPlanId(), ct);
await _tenants.AssignBillingAsync(tenantId.Value, stripeCustomerId, subscriptionId, ct);
// Step 4: Create admin user
var adminUser = await _users.CreateAdminAsync(
tenantId.Value, command.AdminEmail, command.AdminName, ct);
// Step 5: Send welcome email with setup link
await _email.SendWelcomeAsync(command.AdminEmail, command.AdminName, slug, ct);
// Step 6: Activate tenant
await _tenants.ActivateAsync(tenantId.Value, ct);
return OnboardingResult.Ok(tenantId.Value, slug);
}
catch (Exception ex)
{
_logger.LogError(ex, "Onboarding failed for {CompanyName}", command.CompanyName);
if (tenantId.HasValue)
await CompensateAsync(tenantId.Value, command.Tier, ct);
throw;
}
}
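private async Task CompensateAsync(Guid tenantId, TenantTier tier, CancellationToken ct)
{
// Best-effort cleanup: log each failure but don't rethrow, so one failed step doesn't skip the rest
// Note: a database provisioned for an Enterprise tenant is not cleaned up here
await TryAsync("cancel subscription", () => _billing.CancelSubscriptionAsync(tenantId, ct));
await TryAsync("mark provisioning failed", () => _tenants.MarkProvisioningFailedAsync(tenantId, ct));
if (tier == TenantTier.Professional)
await TryAsync("drop schema", () => _schemaProvisioner.DropSchemaAsync(tenantId, ct));
async Task TryAsync(string step, Func<Task> action)
{
try { await action(); }
catch (Exception ex) { _logger.LogWarning(ex, "Compensation step {Step} failed for tenant {TenantId}", step, tenantId); }
}
}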
}
Design Decision 6: Stripe Billing with Plan-Based Feature Gating
Each tier has a Stripe Price ID. Webhooks update the tenant's subscription state in real-time.
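
The code in this lesson calls `ToPlanId()` and `PlanIdToTier(...)` without showing them. One plausible shape is a two-way lookup between tiers and Price IDs, sketched below. The Price ID strings are placeholders; real values would come from the Stripe dashboard or configuration, and `PlanCatalog` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum TenantTier { Starter, Professional, Enterprise }

public static class PlanCatalog
{
    // Placeholder Price IDs, not real Stripe identifiers.
    private static readonly IReadOnlyDictionary<TenantTier, string> PriceIds =
        new Dictionary<TenantTier, string>
        {
            [TenantTier.Starter]      = "price_starter_placeholder",
            [TenantTier.Professional] = "price_professional_placeholder",
            [TenantTier.Enterprise]   = "price_enterprise_placeholder",
        };

    // Reverse lookup built once from the forward map.
    private static readonly IReadOnlyDictionary<string, TenantTier> Tiers =
        PriceIds.ToDictionary(kv => kv.Value, kv => kv.Key);

    public static string ToPlanId(this TenantTier tier) => PriceIds[tier];

    // Throws on an unknown price so a misconfigured plan is noticed instead of silently downgraded.
    public static TenantTier PlanIdToTier(string priceId) =>
        Tiers.TryGetValue(priceId, out var tier)
            ? tier
            : throw new ArgumentException($"Unknown Stripe price '{priceId}'", nameof(priceId));
}
```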
[ApiController]
[Route("/webhooks/stripe")]
public class StripeWebhookController : ControllerBase
{
[HttpPost]
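public async Task<IActionResult> Handle()
{
// Signature verification needs the raw request body, so read it directly instead of model-binding it
var json = await new StreamReader(HttpContext.Request.Body).ReadToEndAsync();
var stripeEvent = EventUtility.ConstructEvent(
json,
Request.Headers["Stripe-Signature"],
_config["Stripe:WebhookSecret"]);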
switch (stripeEvent.Type)
{
case Events.CustomerSubscriptionUpdated:
{
var sub = (Subscription)stripeEvent.Data.Object;
var tenantId = Guid.Parse(sub.Metadata["tenant_id"]);
var newTier = PlanIdToTier(sub.Items.Data[0].Price.Id);
await _tenants.UpdateTierAsync(tenantId, newTier);
// Clear cached tenant context so next request picks up new tier
await _tenantCache.InvalidateAsync(tenantId);
break;
}
case Events.CustomerSubscriptionDeleted:
{
var sub = (Subscription)stripeEvent.Data.Object;
var tenantId = Guid.Parse(sub.Metadata["tenant_id"]);
await _tenants.SuspendAsync(tenantId, reason: "subscription_cancelled");
break;
}
case Events.InvoicePaymentFailed:
{
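var invoice = (Invoice)stripeEvent.Data.Object;
// Webhook payloads carry the subscription ID rather than the expanded object, so fetch the
// subscription to read its metadata (assumes StripeConfiguration.ApiKey is set and a
// Stripe.net version that exposes Invoice.SubscriptionId)
var subscription = await new SubscriptionService().GetAsync(invoice.SubscriptionId);
var tenantId = Guid.Parse(subscription.Metadata["tenant_id"]);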
// Grace period: notify admin, don't suspend immediately
await _notifications.SendPaymentFailedAsync(tenantId);
break;
}
}
return Ok();
}
}
Design Decision 7: Per-Tenant Feature Flags and Configuration
Tenants need different behaviours: max users, custom branding, early access features. Feature flags are stored per-tenant and cached.
public class TenantFeatureService
{
private readonly ITenantFeatureFlagRepository _repo;
private readonly ITenantConfigurationRepository _config;
private readonly IDatabase _redis;
public async Task<bool> IsEnabledAsync(string featureKey, CancellationToken ct = default)
{
var tenantId = TenantContextAccessor.Current?.TenantId
?? throw new InvalidOperationException("No tenant context");
var cacheKey = $"features:{tenantId}:{featureKey}";
var cached = await _redis.StringGetAsync(cacheKey);
if (cached.HasValue) return (bool)cached;
var flag = await _repo.GetAsync(tenantId, featureKey, ct);
bool enabled = flag?.Enabled == true &&
(flag.EnabledUntil is null || flag.EnabledUntil > DateTime.UtcNow);
await _redis.StringSetAsync(cacheKey, enabled, TimeSpan.FromMinutes(5));
return enabled;
}
public async Task<T?> GetConfigAsync<T>(string key, T? defaultValue = default, CancellationToken ct = default)
{
var tenantId = TenantContextAccessor.Current?.TenantId
?? throw new InvalidOperationException("No tenant context");
var cacheKey = $"config:{tenantId}:{key}";
var cached = await _redis.StringGetAsync(cacheKey);
if (cached.HasValue)
return JsonSerializer.Deserialize<T>(cached!);
var config = await _config.GetAsync(tenantId, key, ct);
if (config is null) return defaultValue;
var value = JsonSerializer.Deserialize<T>(config.Value);
await _redis.StringSetAsync(cacheKey, config.Value, TimeSpan.FromMinutes(5));
return value;
}
}
// Usage in application code
public class DocumentUploadHandler : IRequestHandler<UploadDocumentCommand, UploadResult>
{
private readonly TenantFeatureService _features;
public async Task<UploadResult> Handle(UploadDocumentCommand cmd, CancellationToken ct)
{
var maxSizeMb = await _features.GetConfigAsync<int>("max_upload_size_mb", defaultValue: 10, ct);
// Multiply as long so a large per-tenant limit cannot overflow int
if (cmd.FileSizeBytes > maxSizeMb * 1_048_576L)
return UploadResult.Fail($"File exceeds {maxSizeMb}MB limit for your plan");
if (await _features.IsEnabledAsync("ocr_processing", ct))
{
// Enterprise feature: OCR
}
// ... rest of upload logic
return UploadResult.Ok();
}
}
Challenge 1: Schema Migrations Across All Tenants
Row-level tenants: one migration runs once. Schema-per-tenant tenants: the same migration must run N times. Enterprise tenants: migrations run against remote databases.
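
Running the same operation across many tenants raises two practical questions: how many run at once, and what happens to the failures. The sketch below caps concurrency and returns the failures per tenant so the caller can report or retry them. `TenantFanOut` is a hypothetical helper, and the `work` delegate stands in for whatever migrates a single tenant.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class TenantFanOut
{
    // Runs `work` for every tenant with at most `maxParallel` in flight.
    // Failures are collected per tenant instead of aborting the whole batch;
    // cancellation is not treated as a tenant failure and propagates to the caller.
    public static async Task<IReadOnlyDictionary<Guid, Exception>> RunAsync(
        IEnumerable<Guid> tenantIds,
        Func<Guid, CancellationToken, Task> work,
        int maxParallel,
        CancellationToken ct = default)
    {
        var failures = new ConcurrentDictionary<Guid, Exception>();
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = maxParallel,
            CancellationToken = ct
        };

        await Parallel.ForEachAsync(tenantIds, options, async (tenantId, token) =>
        {
            try { await work(tenantId, token); }
            catch (Exception ex) when (ex is not OperationCanceledException)
            {
                failures[tenantId] = ex;
            }
        });

        return failures;
    }
}
```

Returning the failures lets a deployment pipeline fail the release, or retry only the affected tenants, rather than relying on someone reading the logs.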
public class TenantMigrationOrchestrator
{
private readonly ITenantRepository _tenants;
private readonly ILogger<TenantMigrationOrchestrator> _logger;
public async Task MigrateAllAsync(CancellationToken ct)
{
// Starter tenants: one shared migration
using var sharedCtx = _sharedDbFactory.Create();
await sharedCtx.Database.MigrateAsync(ct);
_logger.LogInformation("Shared (Starter) schema migrated");
// Professional tenants: migrate each schema
var professionalTenants = await _tenants.GetByTierAsync(TenantTier.Professional, ct);
foreach (var tenant in professionalTenants)
{
try
{
using var ctx = _schemaDbFactory.Create(tenant.SchemaName!);
await ctx.Database.MigrateAsync(ct);
_logger.LogInformation("Schema {Schema} migrated", tenant.SchemaName);
}
catch (Exception ex)
{
// Log but continue — a broken schema migration shouldn't block others
_logger.LogError(ex, "Migration failed for schema {Schema}", tenant.SchemaName);
}
}
// Enterprise tenants: migrate each remote database
var enterpriseTenants = await _tenants.GetByTierAsync(TenantTier.Enterprise, ct);
var migrationTasks = enterpriseTenants.Select(async tenant =>
{
try
{
await using var ctx = await _enterpriseDbFactory.CreateAsync(tenant);
await ctx.Database.MigrateAsync(ct);
_logger.LogInformation("Enterprise DB for tenant {TenantId} migrated", tenant.Id);
}
catch (Exception ex)
{
_logger.LogError(ex, "Enterprise migration failed for tenant {TenantId}", tenant.Id);
}
});
// Parallelise enterprise migrations — they hit separate databases
await Task.WhenAll(migrationTasks);
}
}
Challenge 2: Cross-Tenant Admin Queries
Platform administrators need dashboards showing metrics across all tenants. Global query filters block cross-tenant queries by design — so admin queries must explicitly bypass them.
public class PlatformAdminDbContext : DbContext
{
// No global query filters — this context is only injected into admin services
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
// Entities configured without ITenanted filters
base.OnModelCreating(modelBuilder);
}
}
// Separate admin service, separate DI registration
public class TenantMetricsService
{
private readonly PlatformAdminDbContext _adminCtx;
public async Task<PlatformMetrics> GetMetricsAsync(CancellationToken ct)
{
return new PlatformMetrics(
TotalTenants: await _adminCtx.Tenants.CountAsync(ct),
ActiveTenants: await _adminCtx.Tenants.CountAsync(t => t.Status == TenantStatus.Active, ct),
TotalUsers: await _adminCtx.Users.CountAsync(ct),
MrrGbp: await _adminCtx.Tenants
.Where(t => t.Status == TenantStatus.Active)
.SumAsync(t => t.MonthlyRevenue, ct)
);
}
}
Challenge 3: Data Residency
Some tenants require data to stay in the EU. Others require data in the US. The platform routes database connections based on the tenant's region.
public class RegionAwareConnectionFactory
{
private readonly Dictionary<string, string> _regionConnectionStrings;
public RegionAwareConnectionFactory(IConfiguration config)
{
_regionConnectionStrings = new()
{
["eu-west"] = config["Database:EuWest"]!,
["us-east"] = config["Database:UsEast"]!,
["ap-south"] = config["Database:ApSouth"]!,
};
}
public string GetConnectionString(TenantContext tenant)
{
if (tenant.Tier == TenantTier.Enterprise && tenant.ConnectionString is not null)
return tenant.ConnectionString; // enterprise brings their own DB
if (_regionConnectionStrings.TryGetValue(tenant.Region, out var cs))
return cs;
throw new InvalidOperationException($"No database configured for region {tenant.Region}");
}
}
Challenge 4: Tenant Suspension and Data Retention
When a tenant cancels, their data must be retained for 90 days (legal requirement), then deleted. Suspension must happen immediately but must not delete data.
public class TenantSuspensionService
{
public async Task SuspendAsync(Guid tenantId, string reason, CancellationToken ct)
{
await _tenants.UpdateStatusAsync(tenantId, TenantStatus.Suspended, reason, ct);
// Invalidate all active sessions for this tenant
await _sessionStore.InvalidateTenantSessionsAsync(tenantId, ct);
// Revoke all API keys
await _apiKeys.RevokeAllAsync(tenantId, ct);
// Cache invalidation — next request will resolve suspended status and reject
await _tenantCache.InvalidateAsync(tenantId);
// Schedule data deletion after retention window
await _scheduler.ScheduleAsync(
new TenantDataDeletionJob(tenantId),
delay: TimeSpan.FromDays(90));
}
}
// In middleware: suspended tenants get 402 or 403, not 404
public class TenantStatusMiddleware
{
public async Task InvokeAsync(HttpContext context)
{
var tenant = TenantContextAccessor.Current;
if (tenant is null)
{
await _next(context);
return;
}
if (tenant.Status == TenantStatus.Suspended)
{
context.Response.StatusCode = 402;
await context.Response.WriteAsJsonAsync(new
{
error = "Account suspended",
detail = "Contact billing@myapp.com to restore access"
});
return;
}
if (tenant.Status == TenantStatus.Cancelled)
{
context.Response.StatusCode = 410; // Gone
return;
}
await _next(context);
}
}
What We'd Do Differently
Start with a single isolation model. The ability to support all three isolation tiers in one codebase sounds powerful, but it's months of infrastructure before you have a single paying customer. Start with row-level for everyone, then add schema or database isolation when an enterprise deal requires it.
Use a tenant provisioning queue, not a synchronous saga. Long-running provisioning (database creation, schema migration) should be queued via MassTransit, with the user told "we're setting up your account" and notified by webhook or email when it completes. Synchronous provisioning inside an HTTP request can time out for Enterprise tenants.
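
The request/worker split can be illustrated without a broker. The sketch below uses an in-process System.Threading.Channels queue as a stand-in for MassTransit, purely to show the shape: the request handler enqueues and returns, and a background worker does the slow provisioning. `ProvisionTenantRequest` and `ProvisioningQueue` are hypothetical names.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

public sealed record ProvisionTenantRequest(Guid TenantId, string Slug);

public sealed class ProvisioningQueue
{
    private readonly Channel<ProvisionTenantRequest> _channel =
        Channel.CreateUnbounded<ProvisionTenantRequest>();

    // Called from the HTTP request: returns as soon as the work is queued.
    public bool TryEnqueue(ProvisionTenantRequest request) => _channel.Writer.TryWrite(request);

    // Signals that no more requests will arrive, letting the worker loop finish.
    public void Complete() => _channel.Writer.Complete();

    // Called from a background worker: provisions queued tenants one at a time.
    public async Task ProcessAsync(
        Func<ProvisionTenantRequest, CancellationToken, Task> provision,
        CancellationToken ct = default)
    {
        await foreach (var request in _channel.Reader.ReadAllAsync(ct))
        {
            await provision(request, ct);
        }
    }
}
```

An in-process channel loses queued work if the process restarts, which is the main reason to prefer a durable broker for real provisioning.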
Centralise tenant configuration in a single service. When feature flags and config values are scattered across database tables, Redis, environment variables, and code, debugging "why does this tenant see behaviour X" becomes impossible. One TenantConfigurationService that merges all sources, with a trace log, pays dividends immediately.
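
A merged lookup with a trace can be quite small. The sketch below resolves a key against sources in priority order and records which sources were consulted, which is the information needed to answer "why does this tenant see behaviour X". `ConfigSource`, `ResolvedValue` and `LayeredTenantConfiguration` are hypothetical names, and plain dictionaries stand in for the database, Redis and environment sources.

```csharp
using System;
using System.Collections.Generic;

public sealed record ConfigSource(string Name, IReadOnlyDictionary<string, string> Values);

public sealed record ResolvedValue(string? Value, string? Source, IReadOnlyList<string> Trace);

public sealed class LayeredTenantConfiguration
{
    private readonly IReadOnlyList<ConfigSource> _sources; // highest priority first

    public LayeredTenantConfiguration(params ConfigSource[] sourcesByPriority) =>
        _sources = sourcesByPriority;

    // Returns the first value found plus a trace of every source consulted on the way.
    public ResolvedValue Resolve(string key)
    {
        var trace = new List<string>();
        foreach (var source in _sources)
        {
            if (source.Values.TryGetValue(key, out var value))
            {
                trace.Add($"{source.Name}: hit ({value})");
                return new ResolvedValue(value, source.Name, trace);
            }
            trace.Add($"{source.Name}: miss");
        }
        return new ResolvedValue(null, null, trace);
    }
}
```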
Test with tenant isolation as a first-class concern. Write integration tests that create two tenants and assert that tenant A's data is never visible to tenant B. A missing global query filter is a data leak — unit tests will not catch it.
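
The shape of such a test is sketched below. To stay self-contained it uses an in-memory stand-in (`TenantScopedStore`, a hypothetical type) instead of the real tenant-filtered DbContext; a real integration test would run the same write-as-A, read-as-B sequence against the actual database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record Document(Guid TenantId, string Title);

// Stand-in for a tenant-filtered DbContext: writes are stamped with the
// current tenant and reads only return the current tenant's rows.
public sealed class TenantScopedStore
{
    private readonly List<Document> _rows;
    private readonly Func<Guid> _currentTenant;

    public TenantScopedStore(List<Document> sharedRows, Func<Guid> currentTenant)
        => (_rows, _currentTenant) = (sharedRows, currentTenant);

    public void Add(string title) => _rows.Add(new Document(_currentTenant(), title));

    public IReadOnlyList<Document> All() =>
        _rows.Where(d => d.TenantId == _currentTenant()).ToList();
}

public static class IsolationCheck
{
    // Writes one row per tenant into the same shared table, then reads as each tenant.
    public static bool TenantsAreIsolated()
    {
        var tenantA = Guid.NewGuid();
        var tenantB = Guid.NewGuid();
        var current = tenantA;
        var store = new TenantScopedStore(new List<Document>(), () => current);

        store.Add("A's contract");
        current = tenantB;
        store.Add("B's invoice");

        var seenByB = store.All();
        current = tenantA;
        var seenByA = store.All();

        return seenByA.Count == 1 && seenByA.All(d => d.TenantId == tenantA)
            && seenByB.Count == 1 && seenByB.All(d => d.TenantId == tenantB);
    }
}
```

Running the same check for every ITenanted entity type is what catches a filter that was never registered.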