SignalR Production Patterns — Scale, Reliability, and Monitoring
Production SignalR: connection lifecycle management, heartbeats, fallback transports, monitoring connection counts, graceful shutdown, and the operational patterns for real-time systems at hospital scale.
Transport Fallback
SignalR negotiates the best available transport:
Transport priority:
1. WebSockets (full duplex, most efficient)
2. Server-Sent Events (server-to-client only)
3. Long Polling (polling with held connection)
Negotiation happens automatically:
Client: "I support WebSockets, SSE, LongPolling"
Server: "I support WebSockets and LongPolling"
Result: WebSockets selected
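The selection rule in the exchange above can be sketched in a few lines. This is a simplified illustration, not SignalR's actual negotiation code: the function name `selectTransport` and the `TRANSPORT_PRIORITY` constant are made up for the example, and the real client and server exchange more information than a list of names.

```javascript
// Priority order taken from the list above (illustrative labels).
const TRANSPORT_PRIORITY = ["WebSockets", "ServerSentEvents", "LongPolling"];

// Returns the highest-priority transport both sides support, or null if none.
function selectTransport(clientSupported, serverSupported) {
  const client = new Set(clientSupported);
  const server = new Set(serverSupported);
  for (const transport of TRANSPORT_PRIORITY) {
    if (client.has(transport) && server.has(transport)) {
      return transport;
    }
  }
  return null;
}

console.log(selectTransport(
  ["WebSockets", "ServerSentEvents", "LongPolling"],
  ["WebSockets", "LongPolling"]
)); // WebSockets
```

If a proxy strips WebSocket upgrades, the same rule falls through to the next transport both sides share, which is why fallback costs efficiency rather than availability.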
Skip negotiation for performance:
.withUrl("/hubs/clinical", {
    transport: signalR.HttpTransportType.WebSockets,
    skipNegotiation: true // requires WebSockets, no fallback
})
Heartbeats and Timeouts
// Configure hub timeouts (server-side)
builder.Services.AddSignalR(options =>
{
    // How often the server pings connected clients (default: 15s)
    options.KeepAliveInterval = TimeSpan.FromSeconds(15);
    // How long the server waits for client response before considering disconnected
    // (default: 30s; must be greater than KeepAliveInterval)
    options.ClientTimeoutInterval = TimeSpan.FromSeconds(30);
    // Max message size (default: 32KB)
    options.MaximumReceiveMessageSize = 64 * 1024; // 64KB
    // Max parallel hub invocations per connection (default: 1)
    options.MaximumParallelInvocationsPerClient = 1;
});

// Client-side: server timeout for heartbeat responses
const connection = new signalR.HubConnectionBuilder()
    .withUrl("/hubs/clinical", { accessTokenFactory: getAccessToken })
    .withAutomaticReconnect()
    .build();
// The client sends its own keep-alive ping (every 15s by default) and treats the
// connection as lost if nothing arrives from the server within the server timeout
connection.serverTimeoutInMilliseconds = 30000; // 30s
connection.keepAliveIntervalInMilliseconds = 15000; // 15s
Graceful Shutdown
// Tell SignalR to gracefully close connections before the app stops
builder.Services.AddSignalR();
builder.Services.Configure<HostOptions>(options =>
{
    // Give connections time to close before forceful shutdown
    options.ShutdownTimeout = TimeSpan.FromSeconds(30);
});
// The hub sends a close message to clients before the connection ends
// Clients receive onclose() and can show "server shutting down, please refresh"
Connection Count Monitoring
// Track total connections via IConnectionTracker
// (an application-defined service, not a SignalR type)
public sealed class ConnectionCountMetrics : IHostedService
{
    private readonly IConnectionTracker _tracker;
    public ConnectionCountMetrics(
        IConnectionTracker tracker, IMeterFactory meters)
    {
        _tracker = tracker;
        // Meters created through IMeterFactory are owned and disposed by the factory
        var meter = meters.Create("SystemForge.SignalR");
        // Observable gauges are pulled: the callback runs each time a listener
        // collects metrics, so no timer is needed to report the value
        meter.CreateObservableGauge(
            "signalr.connections.active",
            () => _tracker.TotalConnections,
            description: "Active SignalR connections");
    }
    // Registering this class with AddHostedService forces it to be constructed
    // at startup, which is what registers the gauge
    public Task StartAsync(CancellationToken ct) => Task.CompletedTask;
    public Task StopAsync(CancellationToken ct) => Task.CompletedTask;
}
Error Handling in Hub Methods
public sealed class ClinicalDashboardHub : Hub<IClinicalDashboardClient>
{
    // _service and _logger are constructor-injected fields (declarations omitted here)
    [Authorize]
    public async Task UpdateDrugOrder(Guid orderId, string newStatus)
    {
        try
        {
            var result = await _service.UpdateStatusAsync(orderId, newStatus);
            if (result.IsFailure)
            {
                // HubException message is sent to the client
                throw new HubException($"Order update failed: {result.Error.Description}");
            }
            // Notify ward on success
            await Clients.Group($"ward:{result.Value.WardId}")
                .DrugOrderStatusChanged(result.Value.ToDto());
        }
        catch (HubException)
        {
            throw; // re-throw HubException so its message reaches the client
        }
        catch (Exception ex)
        {
            // Log but don't expose internal error to client
            _logger.LogError(ex, "Unhandled error updating drug order {OrderId}", orderId);
            throw new HubException("An unexpected error occurred. Please try again.");
        }
    }
}
Message Size and Throttling
// Prevent large messages from overwhelming the hub
builder.Services.AddSignalR(options =>
{
    // Reject messages larger than 32KB
    options.MaximumReceiveMessageSize = 32 * 1024; // 32KB
});
// For large payloads (e.g., PDF reports), send the URL, not the content.
// The client fetches via HTTP; SignalR is for notifications, not file transfer
await Clients.Caller.LargeReportReady(new { ReportUrl = "/reports/abc123" });
Per-Connection Rate Limiting
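// Custom filter to rate-limit hub method calls per connection
// (RateLimitState is an application-defined class, not part of SignalR)
public sealed class HubRateLimitFilter : IHubFilter
{
    private readonly ConcurrentDictionary<string, RateLimitState> _state = new();
    public async ValueTask<object?> InvokeMethodAsync(
        HubInvocationContext ctx,
        Func<HubInvocationContext, ValueTask<object?>> next)
    {
        var connectionId = ctx.Context.ConnectionId;
        var state = _state.GetOrAdd(connectionId, _ => new RateLimitState());
        if (state.IsRateLimited())
            throw new HubException("Rate limit exceeded. Please slow down.");
        state.RecordCall();
        return await next(ctx);
    }
    // Drop per-connection state on disconnect so the dictionary does not grow without bound
    public Task OnDisconnectedAsync(
        HubLifetimeContext ctx, Exception? exception,
        Func<HubLifetimeContext, Exception?, Task> next)
    {
        _state.TryRemove(ctx.Context.ConnectionId, out _);
        return next(ctx, exception);
    }
}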
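The filter above relies on a `RateLimitState` type that is not shown. As a rough, language-neutral illustration of what such a type might track, here is a JavaScript sketch of a fixed-window counter. The limit of 10 calls per second, the method names, and the injectable clock are all assumptions for the example; a C# version would also need to be thread-safe.

```javascript
// Fixed-window counter: at most `limit` calls per `windowMs` for one connection.
class RateLimitState {
  constructor(limit = 10, windowMs = 1000, now = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now; // injectable clock, handy for tests
    this.windowStart = now();
    this.count = 0;
  }

  // Start a fresh window once the current one has expired.
  rollWindow() {
    const t = this.now();
    if (t - this.windowStart >= this.windowMs) {
      this.windowStart = t;
      this.count = 0;
    }
  }

  isRateLimited() {
    this.rollWindow();
    return this.count >= this.limit;
  }

  recordCall() {
    this.rollWindow();
    this.count += 1;
  }
}

// Usage with a fake clock: two calls allowed per second.
let fakeTime = 0;
const state = new RateLimitState(2, 1000, () => fakeTime);
state.recordCall();
state.recordCall();
console.log(state.isRateLimited()); // true
fakeTime = 1000; // the window has elapsed
console.log(state.isRateLimited()); // false
```

A fixed window is the simplest option but allows short bursts at window boundaries; a sliding window or token bucket smooths that out at the cost of a little more state.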
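// Register. Hub filters are created per hub invocation by default, so register
// the filter as a singleton to keep its per-connection state across calls
builder.Services.AddSingleton<HubRateLimitFilter>();
builder.Services.AddSignalR(options =>
    options.AddFilter<HubRateLimitFilter>());
Health Checks for SignalR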
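// Report the active connection count. This check always returns Healthy;
// it does not probe whether the hub accepts new connections
builder.Services.AddHealthChecks()
    .AddCheck("signalr", () =>
    {
        // _connectionTracker must be an instance that is reachable here at startup
        var count = _connectionTracker.TotalConnections;
        return HealthCheckResult.Healthy($"{count} active connections");
    })
    // AddRedis comes from the AspNetCore.HealthChecks.Redis package
    .AddRedis(redisConnectionString, name: "signalr-backplane");
app.MapHealthChecks("/health/signalr", new HealthCheckOptions
{
    // Matches both "signalr" and "signalr-backplane"
    Predicate = check => check.Name.StartsWith("signalr"),
    // UIResponseWriter comes from the AspNetCore.HealthChecks.UI.Client package
    ResponseWriter = UIResponseWriter.WriteHealthCheckUIResponse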
});
Production issue I've seen: A hospital's ward monitoring system had no heartbeat timeout configured. When a nurse's laptop went to sleep, the WebSocket connection stayed "open" from the server's perspective for 4 hours (until the OS forcibly closed it). With 200 nurses, the server accumulated thousands of zombie connections. Setting ClientTimeoutInterval to 30 seconds cleaned up stale connections promptly, reducing memory usage by 40%.
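To make the mechanism concrete, here is a conceptual sketch of timeout-based cleanup: record when each connection was last heard from and evict the ones that have gone silent. It illustrates the idea behind a client timeout, not SignalR's internal implementation; the class name `StaleConnectionDetector` and the connection IDs are invented for the example.

```javascript
// Tracks the last time each connection was heard from and reports the ones
// that have been silent for longer than the timeout.
class StaleConnectionDetector {
  constructor(timeoutMs = 30000) {
    this.timeoutMs = timeoutMs;
    this.lastSeen = new Map(); // connectionId -> timestamp in ms
  }

  // Call whenever any message (including a ping) arrives from the client.
  touch(connectionId, nowMs) {
    this.lastSeen.set(connectionId, nowMs);
  }

  // Removes and returns the IDs of connections silent for more than timeoutMs.
  sweep(nowMs) {
    const stale = [];
    for (const [id, seenAt] of this.lastSeen) {
      if (nowMs - seenAt > this.timeoutMs) {
        stale.push(id);
        this.lastSeen.delete(id);
      }
    }
    return stale;
  }
}

const detector = new StaleConnectionDetector(30000);
detector.touch("nurse-laptop", 0);     // laptop goes to sleep after this
detector.touch("ward-display", 0);
detector.touch("ward-display", 25000); // still sending pings
console.log(detector.sweep(40000)); // [ 'nurse-laptop' ]
```

Without a sweep like this, the sleeping laptop's entry would stay in the map until the operating system tore down the socket, which is the zombie-connection pattern described above.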
Deployment Checklist
Pre-deployment SignalR checklist:
☐ Redis backplane configured for 2+ instances
☐ Load balancer supports WebSocket upgrade headers
☐ KeepAliveInterval and ClientTimeoutInterval configured
☐ MaximumReceiveMessageSize appropriate for your payloads
☐ Hub methods handle errors with HubException
☐ Clients re-join groups on reconnect
☐ Connection count monitoring in place
☐ Redis health check configured
☐ Graceful shutdown timeout set
☐ CORS configured for hub paths
☐ JWT extracted from query string for hub paths
Key Takeaway
Production SignalR requires more than just AddSignalR(): configure heartbeat timeouts to clean up zombie connections, a Redis backplane for multi-instance consistency, graceful shutdown for clean client disconnects, and monitoring for connection count and backplane health. In hub methods, throw HubException for errors the client should see; log any other exception on the server and return a generic message instead. Clients must re-join groups after reconnect, because groups are per-connection and reset on reconnect.
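A minimal sketch of the re-join step on the client, assuming the server exposes a hub method for joining a ward group. `JoinWardGroup` is a hypothetical method name, and the stub connection below only mimics the two `HubConnection` members the helper uses (`onreconnected` and `invoke`) so the example runs without the SignalR client library.

```javascript
// Re-joins every tracked group after an automatic reconnect.
// `connection` is expected to look like a @microsoft/signalr HubConnection.
// "JoinWardGroup" is a hypothetical hub method name.
function rejoinGroupsOnReconnect(connection, getGroups) {
  connection.onreconnected(() =>
    Promise.all(
      getGroups().map((group) => connection.invoke("JoinWardGroup", group))
    )
  );
}

// Stub connection for demonstration only.
const calls = [];
let reconnectedHandler = null;
const stubConnection = {
  onreconnected(callback) { reconnectedHandler = callback; },
  invoke(method, ...args) { calls.push([method, ...args]); return Promise.resolve(); },
};

rejoinGroupsOnReconnect(stubConnection, () => ["ward:icu", "ward:er"]);
reconnectedHandler("new-connection-id"); // simulate a reconnect
console.log(calls.length); // 2
```

Passing a function rather than a fixed array lets the helper pick up whichever groups the user is subscribed to at the moment the reconnect happens.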