.NET & C# Development · Lesson 206 of 229
System Design: Social Media Feed in .NET — Fan-Out, Celebrity Accounts, and Hybrid Feed Strategies
System Design: Instagram-Style Social Platform in .NET
Social media platforms are among the hardest systems to scale correctly. The core challenge is deceptively simple: when a user posts, their followers should see it. But "followers" can mean 200 people or 50 million people. A single architectural decision — push to everyone on write vs. pull on read — looks fine at 10,000 users and catastrophic at 10 million.
This case study walks through designing an Instagram-style platform in .NET 9. We cover the hybrid fanout strategy (push for regular users, pull for celebrities), the follow graph in Redis, a complete media upload pipeline through Azure Blob Storage and CDN, real-time notifications via SignalR, and two production challenges: write spikes from celebrity posts and deleted posts surfacing in feeds. All design decisions are explained with the tradeoffs that drove them.
System Overview
The platform has four primary flows:
1. Post creation:
User → API → Media Pipeline → Post stored → Feed Fanout (async)
2. Feed read:
User → API → FeedService → Redis ZSET (own feed) + pull from celebrities → merge → return
3. Media upload:
Client → PresignedURL endpoint → Azure Blob (direct upload) →
Blob trigger → Azure Function → resize/compress → CDN propagation
4. Notifications:
Like/Comment → SignalR hub (real-time) + Outbox → Email digest worker (async)Key infrastructure choices:
- SQL Server (EF Core) for durable, structured data: posts, users, follows, media metadata
- Redis for ephemeral high-frequency data: feed lists, follow graph, online presence
- Azure Blob Storage for raw and processed media
- Azure CDN for media delivery
- MassTransit + RabbitMQ for async fanout and notifications
- SignalR for real-time push to connected clients
Data Model
public class User
{
public Guid Id { get; private set; }
public string Username { get; private set; } = default!;
public string DisplayName { get; private set; } = default!;
public int FollowerCount { get; private set; } // denormalized counter
public bool IsCelebrity => FollowerCount >= 10_000;
public DateTimeOffset CreatedAt { get; private set; }
}
public class Post
{
public Guid Id { get; private set; }
public Guid AuthorId { get; private set; }
public string Caption { get; private set; } = default!;
public List<MediaAsset> Media { get; private set; } = [];
public int LikeCount { get; private set; } // denormalized
public int CommentCount { get; private set; } // denormalized
public DateTimeOffset CreatedAt { get; private set; }
public DateTimeOffset? DeletedAt { get; private set; } // soft delete
public bool IsDeleted => DeletedAt.HasValue;
public static Post Create(Guid authorId, string caption, List<Guid> mediaAssetIds)
{
var post = new Post
{
Id = Guid.NewGuid(),
AuthorId = authorId,
Caption = caption,
CreatedAt = DateTimeOffset.UtcNow
};
post.Media = mediaAssetIds
.Select(id => new MediaAsset { Id = id, PostId = post.Id })
.ToList();
return post;
}
public void SoftDelete()
{
DeletedAt = DateTimeOffset.UtcNow;
}
}
// Denormalized feed item — one row per follower per post (for push fanout)
public class FeedItem
{
public long Id { get; set; } // auto-increment for ordering
public Guid RecipientUserId { get; set; }
public Guid PostId { get; set; }
public Guid AuthorId { get; set; }
public DateTimeOffset PostedAt { get; set; }
public bool IsFromCelebrity { get; set; } // stored so pull-merge skips push rows
}
// Follow edge
public class Follow
{
public Guid FollowerId { get; set; }
public Guid FolloweeId { get; set; }
public DateTimeOffset CreatedAt { get; set; }
}
// Media asset — metadata in SQL, bytes in Azure Blob
public class MediaAsset
{
public Guid Id { get; set; }
public Guid PostId { get; set; }
public string BlobPath { get; set; } = default!; // raw upload path
public string? ProcessedBlobPath { get; set; } // resized/compressed
public string? CdnUrl { get; set; } // populated after CDN propagation
public MediaProcessingStatus ProcessingStatus { get; set; }
public int? WidthPx { get; set; }
public int? HeightPx { get; set; }
}
public enum MediaProcessingStatus
{
Pending,
Processing,
Ready,
Failed
}The FeedItem table is intentionally denormalized. We store AuthorId and PostedAt directly so the feed read path can skip joins. The tradeoff: writes to the feed table are proportional to follower count. For users with 200 followers this is 200 rows. For users with 50,000 followers this is 50,000 rows — which is why celebrities use a different strategy.
The FollowerCount denormalized counter on User is the gate for the celebrity threshold. We chose 10,000 as the inflection point based on profiling: at that follower count, synchronous fanout on write adds more than 500ms of latency to a post request, which is unacceptable.
Key Design Decisions
1. Fanout-on-Write (Push) for Regular Users
When a non-celebrity posts, we write a FeedItem row for every follower immediately (via a background job — the HTTP response returns before fanout completes):
public class FanoutOnWriteConsumer : IConsumer<PostCreatedEvent>
{
private readonly SocialDbContext _db;
private readonly IConnectionMultiplexer _redis;
private readonly ILogger<FanoutOnWriteConsumer> _logger;
public FanoutOnWriteConsumer(
SocialDbContext db,
IConnectionMultiplexer redis,
ILogger<FanoutOnWriteConsumer> logger)
{
_db = db;
_redis = redis;
_logger = logger;
}
public async Task Consume(ConsumeContext<PostCreatedEvent> context)
{
var msg = context.Message;
// Load all follower IDs (batched to avoid loading millions into memory)
var followerIds = await _db.Follows
.AsNoTracking()
.Where(f => f.FolloweeId == msg.AuthorId)
.Select(f => f.FollowerId)
.ToListAsync(context.CancellationToken);
if (followerIds.Count == 0) return;
// Write feed items in SQL batches of 1000
var feedItems = followerIds.Select(fid => new FeedItem
{
RecipientUserId = fid,
PostId = msg.PostId,
AuthorId = msg.AuthorId,
PostedAt = msg.CreatedAt,
IsFromCelebrity = false
}).ToList();
// Bulk insert via EF Core
await _db.FeedItems.AddRangeAsync(feedItems, context.CancellationToken);
await _db.SaveChangesAsync(context.CancellationToken);
// Also push post ID to each follower's Redis feed ZSET
// Score = Unix timestamp for chronological ordering
var db = _redis.GetDatabase();
var score = msg.CreatedAt.ToUnixTimeMilliseconds();
// Pipeline Redis writes for efficiency
var pipeline = db.CreateBatch();
var tasks = followerIds.Select(fid =>
{
var feedKey = $"feed:{fid}";
// Keep feed ZSET capped at 1000 entries — older entries fall off
var addTask = pipeline.SortedSetAddAsync(feedKey, msg.PostId.ToString(), score);
var trimTask = pipeline.SortedSetRemoveRangeByRankAsync(feedKey, 0, -1001);
return Task.WhenAll(addTask, trimTask);
});
pipeline.Execute();
await Task.WhenAll(tasks);
_logger.LogInformation("Fanout complete: post {PostId} pushed to {Count} followers.",
msg.PostId, followerIds.Count);
}
}The Redis ZSET per user (key: feed:{userId}) stores post IDs with a timestamp score. This lets the feed reader retrieve the latest N posts in O(log N + M) time. The cap of 1,000 entries prevents unbounded growth — users who haven't opened the app in months get a truncated feed, which is acceptable.
2. Fanout-on-Read (Pull) for Celebrity Posts
When a user opens their feed, the service checks which of their followees are celebrities. For each celebrity it has not fetched recently, it queries that celebrity's recent posts directly from SQL (not from a pre-built feed list) and merges them at read time:
public class CelebrityFeedPuller
{
private readonly SocialDbContext _db;
private readonly IConnectionMultiplexer _redis;
public CelebrityFeedPuller(SocialDbContext db, IConnectionMultiplexer redis)
{
_db = db;
_redis = redis;
}
public async Task<List<FeedPostDto>> PullCelebrityPostsAsync(
Guid viewerUserId,
DateTimeOffset since,
int limit = 20,
CancellationToken ct = default)
{
// Get list of celebrities this user follows (cached in Redis SET)
var db = _redis.GetDatabase();
var celebKey = $"following:celebrities:{viewerUserId}";
var celebIds = await db.SetMembersAsync(celebKey);
if (celebIds.Length == 0)
return [];
var celebGuids = celebIds
.Select(c => Guid.Parse(c.ToString()))
.ToList();
// Pull their recent posts directly from SQL
return await _db.Posts
.AsNoTracking()
.Where(p => celebGuids.Contains(p.AuthorId)
&& p.CreatedAt >= since
&& !p.IsDeleted)
.OrderByDescending(p => p.CreatedAt)
.Take(limit)
.Select(p => new FeedPostDto(
p.Id,
p.AuthorId,
p.Caption,
p.Media.Select(m => m.CdnUrl ?? string.Empty).ToList(),
p.LikeCount,
p.CreatedAt))
.ToListAsync(ct);
}
}Why not push celebrity posts at all? Because a single post from a user with 50 million followers would require 50 million Redis writes and 50 million SQL inserts. At even 1 MB/s throughput that takes over 13 hours to complete. By the time fanout finishes, the post is stale. Pull is the only practical model at celebrity scale.
3. Hybrid Merge at Read Time
The FeedService combines both strategies. It reads the viewer's pre-built push feed (from Redis ZSET), hydrates the post details from SQL, then merges in celebrity posts pulled on demand:
public class FeedService
{
private readonly SocialDbContext _db;
private readonly IConnectionMultiplexer _redis;
private readonly CelebrityFeedPuller _celebPuller;
public FeedService(
SocialDbContext db,
IConnectionMultiplexer redis,
CelebrityFeedPuller celebPuller)
{
_db = db;
_redis = redis;
_celebPuller = celebPuller;
}
public async Task<FeedResponse> GetFeedAsync(
Guid userId,
int page = 0,
int pageSize = 20,
CancellationToken ct = default)
{
var redisDb = _redis.GetDatabase();
var feedKey = $"feed:{userId}";
// Read post IDs from user's push-feed ZSET (newest first)
var postEntries = await redisDb.SortedSetRangeByRankWithScoresAsync(
feedKey,
start: page * pageSize,
stop: (page * pageSize) + pageSize - 1,
order: Order.Descending);
var pushPostIds = postEntries
.Select(e => Guid.Parse(e.Element.ToString()))
.ToList();
// Hydrate push posts from SQL in one query
var pushPosts = pushPostIds.Count > 0
? await _db.Posts
.AsNoTracking()
.Where(p => pushPostIds.Contains(p.Id) && !p.IsDeleted)
.Include(p => p.Media)
.Select(p => new FeedPostDto(
p.Id, p.AuthorId, p.Caption,
p.Media.Select(m => m.CdnUrl ?? string.Empty).ToList(),
p.LikeCount, p.CreatedAt))
.ToListAsync(ct)
: [];
// Pull celebrity posts for the current window
var windowStart = page == 0
? DateTimeOffset.UtcNow.AddHours(-24)
: DateTimeOffset.UtcNow.AddHours(-24 * (page + 1));
var celebPosts = await _celebPuller.PullCelebrityPostsAsync(
userId, since: windowStart, limit: pageSize, ct: ct);
// Merge and re-sort by timestamp, deduplicate
var merged = pushPosts
.Union(celebPosts, FeedPostDtoComparer.ById)
.OrderByDescending(p => p.CreatedAt)
.Take(pageSize)
.ToList();
return new FeedResponse(merged, HasMore: merged.Count == pageSize);
}
}
public record FeedPostDto(
Guid PostId,
Guid AuthorId,
string Caption,
List<string> MediaUrls,
int LikeCount,
DateTimeOffset CreatedAt);
public record FeedResponse(List<FeedPostDto> Posts, bool HasMore);
public class FeedPostDtoComparer : IEqualityComparer<FeedPostDto>
{
public static readonly FeedPostDtoComparer ById = new();
public bool Equals(FeedPostDto? x, FeedPostDto? y) => x?.PostId == y?.PostId;
public int GetHashCode(FeedPostDto obj) => obj.PostId.GetHashCode();
}The merge is done in memory after both sources are fetched. The total data volume is small (20 push IDs + up to 20 celebrity posts = at most 40 objects), so in-memory merge is faster than any SQL UNION across different data sources.
Follow Graph in Redis
The follow graph has two access patterns:
- "Who does user X follow?" — needed for celebrity list hydration
- "Who follows user X?" — needed for fanout
We store both directions in Redis:
public class FollowGraphService
{
private readonly IConnectionMultiplexer _redis;
private readonly SocialDbContext _db;
// Key conventions:
// following:{userId} → SET of user IDs that userId follows
// followers:{userId} → SET of user IDs that follow userId
// following:celebrities:{userId} → SET of celebrity IDs that userId follows
public FollowGraphService(IConnectionMultiplexer redis, SocialDbContext db)
{
_redis = redis;
_db = db;
}
public async Task FollowAsync(Guid followerId, Guid followeeId, CancellationToken ct = default)
{
// Persist to SQL for durability
var follow = new Follow
{
FollowerId = followerId,
FolloweeId = followeeId,
CreatedAt = DateTimeOffset.UtcNow
};
_db.Follows.Add(follow);
// Increment denormalized follower count
var followee = await _db.Users.FindAsync([followeeId], ct)
?? throw new NotFoundException(followeeId);
followee.IncrementFollowerCount();
bool becameCelebrity = followee.IsCelebrity;
await _db.SaveChangesAsync(ct);
// Update Redis graph
var db = _redis.GetDatabase();
var batch = db.CreateBatch();
batch.SetAddAsync($"following:{followerId}", followeeId.ToString());
batch.SetAddAsync($"followers:{followeeId}", followerId.ToString());
if (becameCelebrity)
{
// Mark as celebrity in all followers' celebrity sets
// (in practice we do this async via a background job)
batch.SetAddAsync($"following:celebrities:{followerId}", followeeId.ToString());
}
batch.Execute();
}
public async Task UnfollowAsync(Guid followerId, Guid followeeId, CancellationToken ct = default)
{
var follow = await _db.Follows
.FirstOrDefaultAsync(f => f.FollowerId == followerId
&& f.FolloweeId == followeeId, ct);
if (follow is null) return;
_db.Follows.Remove(follow);
var followee = await _db.Users.FindAsync([followeeId], ct)!;
followee?.DecrementFollowerCount();
await _db.SaveChangesAsync(ct);
var db = _redis.GetDatabase();
var batch = db.CreateBatch();
batch.SetRemoveAsync($"following:{followerId}", followeeId.ToString());
batch.SetRemoveAsync($"followers:{followeeId}", followerId.ToString());
batch.SetRemoveAsync($"following:celebrities:{followerId}", followeeId.ToString());
batch.Execute();
}
public async Task<List<Guid>> GetFollowersAsync(Guid userId, CancellationToken ct = default)
{
var db = _redis.GetDatabase();
var members = await db.SetMembersAsync($"followers:{userId}");
return members.Select(m => Guid.Parse(m.ToString())).ToList();
}
public async Task<List<Guid>> GetCelebritiesFollowedByAsync(
Guid userId,
CancellationToken ct = default)
{
var db = _redis.GetDatabase();
var members = await db.SetMembersAsync($"following:celebrities:{userId}");
return members.Select(m => Guid.Parse(m.ToString())).ToList();
}
}The celebrity SET (following:celebrities:{userId}) is the key performance optimization. Without it, the feed reader would have to check each followee's FollowerCount on every feed load to decide push vs. pull. By maintaining a pre-built celebrity SET per viewer, the pull decision is O(1).
The tradeoff: when a user crosses the 10,000-follower threshold, we need to retroactively populate the celebrity SET in all of their followers' Redis keys. We handle this with a background job triggered by the UserBecameCelebrityEvent domain event.
Media Upload Pipeline
Direct-to-blob uploads via presigned URLs avoid routing large files through the API servers. The flow:
- Client requests a presigned URL
- Client uploads directly to Azure Blob Storage (bypasses API)
- Blob storage triggers an Azure Function
- Function resizes, compresses, and writes processed variants
- Function updates CDN and marks asset as Ready in SQL
public class MediaUploadService
{
private readonly BlobServiceClient _blobClient;
private readonly SocialDbContext _db;
private readonly IConfiguration _config;
private const string RawContainer = "media-raw";
private const string ProcessedContainer = "media-processed";
public MediaUploadService(
BlobServiceClient blobClient,
SocialDbContext db,
IConfiguration config)
{
_blobClient = blobClient;
_db = db;
_config = config;
}
public async Task<PresignedUploadResponse> GeneratePresignedUploadAsync(
Guid userId,
string fileName,
string contentType,
CancellationToken ct = default)
{
var assetId = Guid.NewGuid();
var extension = Path.GetExtension(fileName);
var blobPath = $"{userId}/{assetId}{extension}";
var containerClient = _blobClient.GetBlobContainerClient(RawContainer);
var blobClient = containerClient.GetBlobClient(blobPath);
// Generate SAS URI valid for 15 minutes (upload window)
var sasBuilder = new BlobSasBuilder
{
BlobContainerName = RawContainer,
BlobName = blobPath,
Resource = "b",
ExpiresOn = DateTimeOffset.UtcNow.AddMinutes(15)
};
sasBuilder.SetPermissions(BlobSasPermissions.Write | BlobSasPermissions.Create);
var sasUri = blobClient.GenerateSasUri(sasBuilder);
// Pre-register the asset in SQL — status is Pending
var asset = new MediaAsset
{
Id = assetId,
BlobPath = blobPath,
ProcessingStatus = MediaProcessingStatus.Pending
};
_db.MediaAssets.Add(asset);
await _db.SaveChangesAsync(ct);
return new PresignedUploadResponse(assetId, sasUri.ToString(), blobPath);
}
}
public record PresignedUploadResponse(Guid AssetId, string UploadUrl, string BlobPath);The Azure Function that processes uploads:
// Azure Functions isolated worker
public class MediaProcessingFunction
{
private readonly BlobServiceClient _blobClient;
private readonly SocialDbContext _db;
private readonly string _cdnBaseUrl;
public MediaProcessingFunction(
BlobServiceClient blobClient,
SocialDbContext db,
IConfiguration config)
{
_blobClient = blobClient;
_db = db;
_cdnBaseUrl = config["CdnBaseUrl"]!;
}
[Function("ProcessUploadedMedia")]
public async Task Run(
[BlobTrigger("media-raw/{blobPath}", Connection = "AzureWebJobsStorage")]
Stream rawBlob,
string blobPath,
FunctionContext context)
{
var logger = context.GetLogger<MediaProcessingFunction>();
// Extract asset ID from blob path (format: userId/assetId.ext)
var fileName = Path.GetFileNameWithoutExtension(blobPath);
if (!Guid.TryParse(fileName.Split('/').Last(), out var assetId))
{
logger.LogError("Cannot parse asset ID from path: {Path}", blobPath);
return;
}
var asset = await _db.MediaAssets.FindAsync(assetId);
if (asset is null) return;
asset.ProcessingStatus = MediaProcessingStatus.Processing;
await _db.SaveChangesAsync();
try
{
// Resize to 1080px max width, compress to WebP
using var image = await Image.LoadAsync(rawBlob);
var targetWidth = Math.Min(image.Width, 1080);
image.Mutate(x => x.Resize(new ResizeOptions
{
Size = new Size(targetWidth, 0),
Mode = ResizeMode.Max
}));
var processedPath = blobPath.Replace("media-raw", "media-processed")
.Replace(Path.GetExtension(blobPath), ".webp");
var processedContainer = _blobClient.GetBlobContainerClient("media-processed");
var processedBlob = processedContainer.GetBlobClient(
processedPath.Replace("media-processed/", ""));
using var ms = new MemoryStream();
await image.SaveAsWebpAsync(ms);
ms.Position = 0;
await processedBlob.UploadAsync(ms, new BlobUploadOptions
{
HttpHeaders = new BlobHttpHeaders { ContentType = "image/webp" }
});
var cdnUrl = $"{_cdnBaseUrl}/{processedPath.Replace("media-processed/", "")}";
asset.ProcessedBlobPath = processedPath;
asset.CdnUrl = cdnUrl;
asset.WidthPx = image.Width;
asset.HeightPx = image.Height;
asset.ProcessingStatus = MediaProcessingStatus.Ready;
await _db.SaveChangesAsync();
logger.LogInformation("Media asset {AssetId} processed. CDN: {Url}", assetId, cdnUrl);
}
catch (Exception ex)
{
logger.LogError(ex, "Media processing failed for asset {AssetId}", assetId);
asset.ProcessingStatus = MediaProcessingStatus.Failed;
await _db.SaveChangesAsync();
}
}
}We use ImageSharp for image processing inside the Azure Function. It is managed, cross-platform, and does not depend on GDI+ (which is unreliable in Azure Functions on Linux). The output is always WebP — 25–35% smaller than JPEG at equivalent visual quality.
Notification Fan-Out
Likes and comments trigger two notification paths: real-time (SignalR) for currently-connected users, and email digest for offline users:
public class NotificationFanoutWorker : IConsumer<PostLikedEvent>
{
private readonly IHubContext<NotificationHub> _hubContext;
private readonly SocialDbContext _db;
private readonly IMessageBus _bus;
public NotificationFanoutWorker(
IHubContext<NotificationHub> hubContext,
SocialDbContext db,
IMessageBus bus)
{
_hubContext = hubContext;
_db = db;
_bus = bus;
}
public async Task Consume(ConsumeContext<PostLikedEvent> context)
{
var msg = context.Message;
// Real-time: push to post author if they are connected
await _hubContext.Clients
.Group($"user:{msg.PostAuthorId}")
.SendAsync("OnLikeReceived", new
{
msg.PostId,
msg.LikedByUserId,
msg.LikedByUsername,
Timestamp = DateTimeOffset.UtcNow
}, context.CancellationToken);
// Enqueue email digest entry — batched and sent by a separate worker once per hour
var digest = new EmailDigestEntry
{
RecipientUserId = msg.PostAuthorId,
EventType = "like",
PostId = msg.PostId,
ActorUserId = msg.LikedByUserId,
CreatedAt = DateTimeOffset.UtcNow,
SentInDigest = false
};
_db.EmailDigestEntries.Add(digest);
await _db.SaveChangesAsync(context.CancellationToken);
}
}
// SignalR hub — users join their own group on connect
public class NotificationHub : Hub
{
public override async Task OnConnectedAsync()
{
var userId = Context.User?.FindFirst("sub")?.Value;
if (userId is not null)
await Groups.AddToGroupAsync(Context.ConnectionId, $"user:{userId}");
await base.OnConnectedAsync();
}
}
// Email digest worker — fires every hour
public class EmailDigestWorker : BackgroundService
{
private readonly IServiceScopeFactory _scopeFactory;
private readonly IEmailSender _emailSender;
public EmailDigestWorker(IServiceScopeFactory scopeFactory, IEmailSender emailSender)
{
_scopeFactory = scopeFactory;
_emailSender = emailSender;
}
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
using var timer = new PeriodicTimer(TimeSpan.FromHours(1));
while (await timer.WaitForNextTickAsync(stoppingToken))
{
await using var scope = _scopeFactory.CreateAsyncScope();
var db = scope.ServiceProvider.GetRequiredService<SocialDbContext>();
var cutoff = DateTimeOffset.UtcNow.AddHours(-1);
var pending = await db.EmailDigestEntries
.Where(e => !e.SentInDigest && e.CreatedAt >= cutoff)
.GroupBy(e => e.RecipientUserId)
.ToListAsync(stoppingToken);
foreach (var group in pending)
{
await _emailSender.SendDigestAsync(group.Key, group.ToList());
foreach (var entry in group)
entry.SentInDigest = true;
}
await db.SaveChangesAsync(stoppingToken);
}
}
}SignalR groups keyed on user ID mean we do not need to track connection IDs manually. When a user has multiple devices connected, all receive the notification. Offline users get a consolidated hourly digest rather than per-event emails, which reduces noise and unsubscribe rates.
Challenges and How We Solved Them
Challenge 1: Celebrity Post Write Spike — Async Fanout via MassTransit
The problem. Before we implemented the hybrid strategy, every post went through fanout-on-write. A test account with 800,000 followers posted. The message consumer began creating 800,000 FeedItem rows. SQL writes saturated the connection pool. Other inserts (new posts from regular users) began queueing behind the bulk insert. The system appeared frozen for 4 minutes.
Why it happened. Fanout-on-write is synchronous by default in a naive implementation — one consumer, one loop, 800,000 iterations. Even with batched inserts, a single consumer thread is a bottleneck.
Solution: Partitioned consumer groups with rate-limited workers.
We restructured the fanout as a two-step process: the PostCreatedEvent consumer shards the work into batches of 500 followers and publishes FanoutBatchCommand messages. Multiple rate-limited consumer instances process batches in parallel:
public class PostCreatedEventConsumer : IConsumer<PostCreatedEvent>
{
private readonly SocialDbContext _db;
private readonly IBus _bus;
private const int BatchSize = 500;
public PostCreatedEventConsumer(SocialDbContext db, IBus bus)
{
_db = db;
_bus = bus;
}
public async Task Consume(ConsumeContext<PostCreatedEvent> context)
{
var msg = context.Message;
// Check if this is a celebrity post — if so, skip push fanout entirely
var author = await _db.Users.AsNoTracking()
.FirstOrDefaultAsync(u => u.Id == msg.AuthorId, context.CancellationToken);
if (author is null) return;
if (author.IsCelebrity)
{
// No fanout needed — readers will pull on demand
return;
}
// Load follower IDs in streaming fashion to avoid OOM on large followings
var followerIds = await _db.Follows
.AsNoTracking()
.Where(f => f.FolloweeId == msg.AuthorId)
.Select(f => f.FollowerId)
.ToListAsync(context.CancellationToken);
// Shard into batches and publish
var batches = followerIds
.Chunk(BatchSize)
.Select((chunk, index) => new FanoutBatchCommand(
msg.PostId,
msg.AuthorId,
msg.CreatedAt,
chunk.ToList(),
BatchIndex: index));
foreach (var batch in batches)
await context.Publish(batch);
}
}
public class FanoutBatchConsumer : IConsumer<FanoutBatchCommand>
{
private readonly SocialDbContext _db;
private readonly IConnectionMultiplexer _redis;
public FanoutBatchConsumer(SocialDbContext db, IConnectionMultiplexer redis)
{
_db = db;
_redis = redis;
}
public async Task Consume(ConsumeContext<FanoutBatchCommand> context)
{
var msg = context.Message;
// SQL bulk insert for this batch of followers
var feedItems = msg.FollowerIds.Select(fid => new FeedItem
{
RecipientUserId = fid,
PostId = msg.PostId,
AuthorId = msg.AuthorId,
PostedAt = msg.CreatedAt,
IsFromCelebrity = false
}).ToList();
await _db.FeedItems.AddRangeAsync(feedItems, context.CancellationToken);
await _db.SaveChangesAsync(context.CancellationToken);
// Redis ZSET update — pipeline all writes
var db = _redis.GetDatabase();
var score = msg.CreatedAt.ToUnixTimeMilliseconds();
var pipeline = db.CreateBatch();
foreach (var fid in msg.FollowerIds)
{
pipeline.SortedSetAddAsync($"feed:{fid}", msg.PostId.ToString(), score);
pipeline.SortedSetRemoveRangeByRankAsync($"feed:{fid}", 0, -1001);
}
pipeline.Execute();
}
}
// MassTransit configuration — rate-limited consumer
services.AddMassTransit(x =>
{
x.AddConsumer<FanoutBatchConsumer>(cfg =>
{
cfg.UseConcurrentMessageLimit(10); // 10 concurrent batch consumers
cfg.UseRateLimit(50, TimeSpan.FromSeconds(1)); // 50 batches/sec max
});
x.UsingRabbitMq((ctx, cfg) =>
{
cfg.ReceiveEndpoint("fanout-batches", ep =>
{
ep.ConfigureConsumer<FanoutBatchConsumer>(ctx);
ep.PrefetchCount = 20;
});
});
});
public record FanoutBatchCommand(
Guid PostId,
Guid AuthorId,
DateTimeOffset CreatedAt,
List<Guid> FollowerIds,
int BatchIndex);Result: the original 800,000-follower post now produces 1,600 batch messages of 500 followers each. 10 concurrent consumers each handle 160 batches. Total fanout time: approximately 35 seconds, with no impact on concurrent write throughput because the SQL inserts are spread across 10 parallel connections.
Challenge 2: Deleted Posts Still Appearing in Feeds — Tombstone Pattern
The problem. When a user deletes a post, we set DeletedAt on the Post entity (soft delete). But the post ID already exists in millions of followers' Redis feed ZSETs. The feed reader hydrates ZSETs from SQL — so deleted posts would be skipped. But there is a race: between soft delete and the feed reader querying SQL, a reader could fetch the post detail and briefly see deleted content.
More importantly, the Redis ZSETs still contain the post ID. Every feed load for every follower hits SQL with a post ID that returns no result. That is wasted SQL queries at scale.
Solution: tombstone pattern in Redis + async feed cleanup job.
On post deletion, we immediately write a tombstone to Redis. Feed readers check tombstones before hydrating. A background job then scans and removes deleted post IDs from ZSETs:
public class PostDeletionService
{
private readonly SocialDbContext _db;
private readonly IConnectionMultiplexer _redis;
private readonly IBus _bus;
private const string TombstoneKey = "deleted-posts"; // ZSET: postId → deletion Unix time
public PostDeletionService(SocialDbContext db, IConnectionMultiplexer redis, IBus bus)
{
_db = db;
_redis = redis;
_bus = bus;
}
public async Task DeletePostAsync(Guid postId, Guid requestingUserId, CancellationToken ct = default)
{
var post = await _db.Posts
.FirstOrDefaultAsync(p => p.Id == postId && p.AuthorId == requestingUserId, ct)
?? throw new NotFoundException(postId);
post.SoftDelete();
await _db.SaveChangesAsync(ct);
// Write tombstone immediately — fast path for feed readers
var db = _redis.GetDatabase();
await db.SortedSetAddAsync(
TombstoneKey,
postId.ToString(),
DateTimeOffset.UtcNow.ToUnixTimeSeconds());
// Publish for async feed cleanup
await _bus.Publish(new PostDeletedEvent(postId, post.AuthorId), ct);
}
}
// Updated FeedService — checks tombstone before returning posts
public class FeedService
{
// ... existing fields ...
private async Task<HashSet<string>> GetTombstonedPostIdsAsync(List<Guid> postIds)
{
var db = _redis.GetDatabase();
var tombstoneKey = "deleted-posts";
// Check which of these post IDs are tombstoned
var scores = await db.SortedSetScoresAsync(
tombstoneKey,
postIds.Select(id => (RedisValue)id.ToString()).ToArray());
return postIds
.Zip(scores)
.Where(pair => pair.Second.HasValue)
.Select(pair => pair.First.ToString())
.ToHashSet();
}
public async Task<FeedResponse> GetFeedAsync(
Guid userId, int page = 0, int pageSize = 20, CancellationToken ct = default)
{
var redisDb = _redis.GetDatabase();
var feedKey = $"feed:{userId}";
var postEntries = await redisDb.SortedSetRangeByRankWithScoresAsync(
feedKey, page * pageSize, (page * pageSize) + pageSize - 1, Order.Descending);
var postIds = postEntries
.Select(e => Guid.Parse(e.Element.ToString()))
.ToList();
// Filter tombstoned posts before hitting SQL
var tombstoned = await GetTombstonedPostIdsAsync(postIds);
var validPostIds = postIds
.Where(id => !tombstoned.Contains(id.ToString()))
.ToList();
// Hydrate only valid posts
var pushPosts = validPostIds.Count > 0
? await _db.Posts
.AsNoTracking()
.Where(p => validPostIds.Contains(p.Id) && !p.IsDeleted)
.Include(p => p.Media)
.Select(p => new FeedPostDto(
p.Id, p.AuthorId, p.Caption,
p.Media.Select(m => m.CdnUrl ?? string.Empty).ToList(),
p.LikeCount, p.CreatedAt))
.ToListAsync(ct)
: [];
var celebPosts = await _celebPuller.PullCelebrityPostsAsync(
userId, DateTimeOffset.UtcNow.AddHours(-24), pageSize, ct);
var merged = pushPosts
.Union(celebPosts, FeedPostDtoComparer.ById)
.OrderByDescending(p => p.CreatedAt)
.Take(pageSize)
.ToList();
return new FeedResponse(merged, merged.Count == pageSize);
}
}The async cleanup job removes deleted post IDs from individual user feed ZSETs. This is done lazily — the tombstone check is the fast path, and the cleanup is best-effort:
public class FeedCleanupConsumer : IConsumer<PostDeletedEvent>
{
private readonly IConnectionMultiplexer _redis;
private readonly SocialDbContext _db;
private readonly ILogger<FeedCleanupConsumer> _logger;
public FeedCleanupConsumer(
IConnectionMultiplexer redis,
SocialDbContext db,
ILogger<FeedCleanupConsumer> logger)
{
_redis = redis;
_db = db;
_logger = logger;
}
public async Task Consume(ConsumeContext<PostDeletedEvent> context)
{
var msg = context.Message;
// Get all followers of the post author
var followerIds = await _db.Follows
.AsNoTracking()
.Where(f => f.FolloweeId == msg.AuthorId)
.Select(f => f.FollowerId)
.ToListAsync(context.CancellationToken);
_logger.LogInformation(
"Cleaning post {PostId} from {Count} feed ZSETs.", msg.PostId, followerIds.Count);
var db = _redis.GetDatabase();
var postIdValue = (RedisValue)msg.PostId.ToString();
// Process in chunks to avoid blocking Redis with huge pipelines
foreach (var chunk in followerIds.Chunk(200))
{
var pipeline = db.CreateBatch();
foreach (var followerId in chunk)
pipeline.SortedSetRemoveAsync($"feed:{followerId}", postIdValue);
pipeline.Execute();
await Task.Delay(10, context.CancellationToken); // brief yield between chunks
}
// Tombstone can expire after 7 days — any feed ZSET older than that
// will not contain this post ID anyway
await db.KeyExpireAsync(
"deleted-posts",
DateTimeOffset.UtcNow.AddDays(7) - DateTimeOffset.UtcNow);
}
}The tombstone acts as a read-time guard with zero latency overhead (one Redis ZSET score lookup per post ID). The background cleanup is a correctness improvement — it removes the IDs so future feed reads do not even include them in the tombstone check. The tombstone TTL of 7 days covers the worst-case cleanup delay.
.NET Implementation Patterns
EF Core Configuration for the Social Schema
public class SocialDbContext : DbContext
{
public DbSet<User> Users => Set<User>();
public DbSet<Post> Posts => Set<Post>();
public DbSet<Follow> Follows => Set<Follow>();
public DbSet<FeedItem> FeedItems => Set<FeedItem>();
public DbSet<MediaAsset> MediaAssets => Set<MediaAsset>();
public DbSet<EmailDigestEntry> EmailDigestEntries => Set<EmailDigestEntry>();
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
// Soft-delete filter — posts with DeletedAt are invisible by default
modelBuilder.Entity<Post>()
.HasQueryFilter(p => p.DeletedAt == null);
// Follow: composite PK, index on both directions
modelBuilder.Entity<Follow>()
.HasKey(f => new { f.FollowerId, f.FolloweeId });
modelBuilder.Entity<Follow>()
.HasIndex(f => f.FolloweeId)
.HasDatabaseName("IX_Follows_Followee");
// FeedItem: primary access pattern is recipient + time descending
modelBuilder.Entity<FeedItem>()
.HasIndex(fi => new { fi.RecipientUserId, fi.PostedAt })
.HasDatabaseName("IX_FeedItems_Recipient_PostedAt")
.IsDescending(false, true);
// Partial index: only unprocessed media assets are queried by worker
modelBuilder.Entity<MediaAsset>()
.HasIndex(m => m.ProcessingStatus)
.HasFilter("[ProcessingStatus] IN (0, 1)") // Pending, Processing
.HasDatabaseName("IX_MediaAssets_PendingProcessing");
// Email digest: unprocessed entries only
modelBuilder.Entity<EmailDigestEntry>()
.HasIndex(e => new { e.RecipientUserId, e.SentInDigest })
.HasFilter("[SentInDigest] = 0")
.HasDatabaseName("IX_EmailDigest_Unsent");
}
}Program Registration
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddDbContext<SocialDbContext>(options =>
options.UseSqlServer(builder.Configuration.GetConnectionString("Social")));
// Redis
builder.Services.AddSingleton<IConnectionMultiplexer>(_ =>
ConnectionMultiplexer.Connect(builder.Configuration.GetConnectionString("Redis")!));
// Azure Blob Storage
builder.Services.AddSingleton(_ =>
new BlobServiceClient(builder.Configuration.GetConnectionString("AzureStorage")));
// Domain services
builder.Services.AddScoped<FeedService>();
builder.Services.AddScoped<FollowGraphService>();
builder.Services.AddScoped<CelebrityFeedPuller>();
builder.Services.AddScoped<MediaUploadService>();
builder.Services.AddScoped<PostDeletionService>();
// SignalR
builder.Services.AddSignalR();
// MassTransit
builder.Services.AddMassTransit(x =>
{
x.AddConsumer<PostCreatedEventConsumer>();
x.AddConsumer<FanoutBatchConsumer>(cfg =>
{
cfg.UseConcurrentMessageLimit(10);
cfg.UseRateLimit(50, TimeSpan.FromSeconds(1));
});
x.AddConsumer<FeedCleanupConsumer>();
x.AddConsumer<NotificationFanoutWorker>();
x.UsingRabbitMq((ctx, cfg) => cfg.ConfigureEndpoints(ctx));
});
// Background workers
builder.Services.AddHostedService<EmailDigestWorker>();
var app = builder.Build();
app.MapHub<NotificationHub>("/hubs/notifications");
app.MapControllers();
app.Run();What We'd Do Differently
1. Pre-compute the celebrity threshold at follow time, not read time. We currently check FollowerCount >= 10_000 in the feed reader to decide push vs pull. This means the threshold logic is spread across three places: follow handler, feed reader, and a background sync job. It should be a single domain event — UserBecameCelebrityEvent — that fires exactly once at threshold crossing and triggers all downstream effects atomically.
2. Use Lua scripts for atomic Redis ZSET operations. The fanout pipeline writes two Redis commands per follower (ZADD + ZREMRANGEBYRANK). Between these two commands, another reader could see an over-full ZSET. A Lua script makes the pair atomic and reduces round trips.
3. CDN cache invalidation on post deletion. We soft-delete posts and tombstone feed ZSETs, but the media CDN still serves the images. A user can share the direct CDN URL before the post is deleted and the image remains accessible indefinitely. We need to either issue CDN purge requests on deletion or use signed CDN URLs with short TTLs. Signed URLs are the better long-term solution but add latency to every image request.
4. Pagination via cursor, not page offset. The feed reader uses page * pageSize for ZSET range queries. This is stable for individual users but the hybrid merge (push + celebrity pull) means the effective result count per page varies. Cursor-based pagination (using the last-seen post timestamp as the cursor) would be more predictable and avoids skipped items when new posts arrive between pages.
5. Instrument the celebrity threshold empirically. We chose 10,000 followers as the fanout threshold based on intuition. We should measure actual fanout time at different follower counts in production and set the threshold dynamically based on p95 fanout latency, not a hardcoded constant. An account with 9,000 followers posting during peak traffic can be just as disruptive as one with 15,000 during off-peak.
6. Separate the email digest store from the main DB. The EmailDigestEntry table grows at the rate of all notification events across all users. At 1 million active users with an average of 20 notification events per day, that is 20 million rows per day. After 90 days the table has 1.8 billion rows. A dedicated time-series or NoSQL store (Cosmos DB or TimescaleDB) with automatic TTL-based expiry would be far cheaper than SQL Server at that volume.