<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Madhav's Blogs]]></title><description><![CDATA[Madhav's Blogs]]></description><link>https://blog.madhav.dev</link><generator>RSS for Node</generator><lastBuildDate>Sat, 16 May 2026 04:44:04 GMT</lastBuildDate><atom:link href="https://blog.madhav.dev/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[How I'd Design a Chat System Like WhatsApp — WebSockets, Message Delivery, and Scaling to Millions]]></title><description><![CDATA[What Most Chat System Design Articles Get Wrong
Search "design a chat system" and you'll find hundreds of articles. Most of them give you a box labeled "Chat Server" with some arrows pointing at a dat]]></description><link>https://blog.madhav.dev/how-i-d-design-a-chat-system-like-whatsapp-websockets-message-delivery-and-scaling-to-millions</link><guid isPermaLink="true">https://blog.madhav.dev/how-i-d-design-a-chat-system-like-whatsapp-websockets-message-delivery-and-scaling-to-millions</guid><category><![CDATA[System Design]]></category><category><![CDATA[websockets]]></category><category><![CDATA[AWS]]></category><category><![CDATA[Python]]></category><category><![CDATA[kafka]]></category><category><![CDATA[backend]]></category><dc:creator><![CDATA[Madhav Bhasin]]></dc:creator><pubDate>Fri, 17 Apr 2026 11:50:15 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/622ac3bee552e99d82e59b17/333a33aa-f12a-4a24-86b4-3a18600f3917.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>What Most Chat System Design Articles Get Wrong</h2>
<p>Search "design a chat system" and you'll find hundreds of articles. Most of them give you a box labeled "Chat Server" with some arrows pointing at a database and call it a day.</p>
<p>That's not a design. That's a diagram with no decisions in it.</p>
<p>The real challenge in designing a chat system isn't the happy path — it's everything that goes wrong. What happens when a user goes offline mid-message? How do you guarantee a message sent to a group of 500 people actually reaches all of them? What does "delivered" even mean at scale?</p>
<p>This post covers the full architecture for a WhatsApp-style chat system supporting 1-on-1 and group messaging — with real decisions, real tradeoffs, and the three technical problems that will define your design: WebSockets, message delivery guarantees, and scaling to millions of concurrent users.</p>
<hr />
<h2>Requirements</h2>
<p>Before touching architecture, let's be precise about what we're building.</p>
<p><strong>Functional requirements:</strong></p>
<ul>
<li><p>1-on-1 messaging between users</p>
</li>
<li><p>Group messaging (up to 500 members per group)</p>
</li>
<li><p>Message delivery receipts (sent, delivered, read)</p>
</li>
<li><p>Online/offline presence indicators</p>
</li>
<li><p>Message history persistence</p>
</li>
<li><p>Media sharing (images, files) — basic support</p>
</li>
</ul>
<p><strong>Non-functional requirements:</strong></p>
<ul>
<li><p>Low latency — messages should feel real-time (under 100ms end-to-end)</p>
</li>
<li><p>High availability — 99.99% uptime, chat can't go down</p>
</li>
<li><p>At-least-once delivery — no message ever silently lost</p>
</li>
<li><p>Eventual consistency — slight delay in delivery receipts is acceptable</p>
</li>
<li><p>Scale — 50 million daily active users, 100 messages per user per day = 5 billion messages per day</p>
</li>
</ul>
<p>Let's use these constraints to drive every architectural decision.</p>
<hr />
<h2>The Naive Approach (And Why It Fails)</h2>
<p>The obvious first attempt:</p>
<pre><code class="language-plaintext">User A → HTTP POST /messages → Server → Database → Poll for new messages → User B
</code></pre>
<p>User B polls every few seconds asking "any new messages?" This works for email. It's catastrophic for chat.</p>
<p>At 50M users polling every 3 seconds:</p>
<ul>
<li><p><strong>~16.7 million requests per second</strong> (50M ÷ 3), just for polling</p>
</li>
<li><p>Most responses are empty — wasted compute</p>
</li>
<li><p>Minimum latency is your polling interval — 3 seconds feels broken for chat</p>
</li>
</ul>
<p>You need persistent connections. Enter WebSockets.</p>
<hr />
<h2>Part 1: WebSockets — The Foundation of Real-Time Messaging</h2>
<h3>How WebSockets Work</h3>
<p>HTTP is request-response — the client always initiates. WebSockets upgrade an HTTP connection to a persistent, bidirectional channel. Once established, either side can send data at any time.</p>
<pre><code class="language-plaintext">Client                          Server
  │                               │
  │── HTTP GET /chat ──────────►  │
  │   Upgrade: websocket          │
  │◄─ 101 Switching Protocols ──  │
  │                               │
  │  ◄──── persistent connection ────►  │
  │                               │
  │◄── {"type":"message",...} ──  │  (server pushes)
  │── {"type":"ack",...} ────────►│  (client responds)
</code></pre>
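<p>As a rough illustration of the <code>101 Switching Protocols</code> step above: under RFC 6455 the server proves it understood the upgrade by hashing the client's <code>Sec-WebSocket-Key</code> header together with a fixed GUID. Frameworks like FastAPI do this for you; the helper name <code>compute_accept</code> below is made up for this sketch.</p>

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def compute_accept(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    raw = (sec_websocket_key + WEBSOCKET_GUID).encode("ascii")
    return base64.b64encode(hashlib.sha1(raw).digest()).decode("ascii")

# Sample key from the RFC 6455 handshake example.
print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

<p>If the client doesn't get back the value it expects, it drops the connection, which is how a plain HTTP server that ignores the <code>Upgrade</code> header gets detected.</p>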
<p><strong>The connection lifecycle:</strong></p>
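<pre><code class="language-python"># FastAPI WebSocket endpoint
from fastapi import FastAPI, WebSocket, WebSocketDisconnect
from redis.asyncio import Redis
from typing import Dict
import json

app = FastAPI()
redis = Redis()  # assumes a Redis instance reachable on localhost:6379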

class ConnectionManager:
    def __init__(self):
        # user_id → WebSocket connection
        self.active_connections: Dict[str, WebSocket] = {}

    async def connect(self, user_id: str, websocket: WebSocket):
        await websocket.accept()
        self.active_connections[user_id] = websocket
        await self.update_presence(user_id, online=True)

    async def disconnect(self, user_id: str):
        self.active_connections.pop(user_id, None)
        await self.update_presence(user_id, online=False)

    async def send_to_user(self, user_id: str, message: dict) -&gt; bool:
        websocket = self.active_connections.get(user_id)
        if websocket:
            await websocket.send_json(message)
            return True
        return False  # User offline — message needs queuing

    async def update_presence(self, user_id: str, online: bool):
        # Publish to Redis pub/sub so other servers know
        await redis.publish(
            f"presence:{user_id}",
            json.dumps({"user_id": user_id, "online": online})
        )

manager = ConnectionManager()

@app.websocket("/ws/{user_id}")
async def websocket_endpoint(websocket: WebSocket, user_id: str):
    await manager.connect(user_id, websocket)
    try:
        while True:
            data = await websocket.receive_json()
            await handle_message(user_id, data)
    except WebSocketDisconnect:
        await manager.disconnect(user_id)
</code></pre>
<h3>The Problem With WebSockets at Scale</h3>
<p>Plan for roughly <strong>65,000 concurrent WebSocket connections</strong> per server. That's a conservative planning figure, not a hard limit: the real ceiling depends on file descriptor limits, memory per connection, and kernel tuning. At 50M DAU with 20% concurrently online, that's 10M concurrent connections, requiring <strong>about 154 chat servers</strong> just to hold connections.</p>
<p>This creates the core architectural challenge: <strong>User A's connection is on Server 1. User B's connection is on Server 7. How does a message from A reach B?</strong></p>
<p>The answer is a message broker acting as the backbone between servers.</p>
<hr />
<h2>The Architecture</h2>
<p>Here's the full system:</p>
<pre><code class="language-plaintext">                    ┌─────────────────────────────┐
                    │         Clients              │
                    │   (WebSocket connections)    │
                    └──────────┬──────────────────┘
                               │
                    ┌──────────▼──────────────────┐
                    │      Load Balancer           │
                    │  (sticky sessions by user)   │
                    └──────────┬──────────────────┘
                               │
          ┌────────────────────┼────────────────────┐
          │                    │                    │
   ┌──────▼──────┐    ┌────────▼──────┐    ┌───────▼──────┐
   │  Chat Server│    │  Chat Server  │    │  Chat Server  │
   │     #1      │    │     #2        │    │     #3        │
   └──────┬──────┘    └──────┬────────┘    └──────┬───────┘
          │                  │                    │
          └──────────────────┼────────────────────┘
                             │
                    ┌────────▼────────────┐
                    │   Message Broker    │
                    │  (Kafka / SQS)      │
                    └────────┬────────────┘
                             │
          ┌──────────────────┼────────────────────┐
          │                  │                    │
   ┌──────▼──────┐    ┌──────▼──────┐    ┌───────▼──────┐
   │  Message    │    │   Presence  │    │  Push Notif  │
   │  Storage    │    │   Service   │    │  Service     │
   │ (DynamoDB)  │    │  (Redis)    │    │ (APNs/FCM)   │
   └─────────────┘    └─────────────┘    └──────────────┘
</code></pre>
<h3>Component Responsibilities</h3>
<p><strong>Load Balancer — Sticky Sessions</strong></p>
<p>Route each user to the same chat server for the duration of their session. This keeps WebSocket connections stable and avoids re-routing overhead. Use consistent hashing on <code>user_id</code>.</p>
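<p>Here's a minimal sketch of consistent hashing on <code>user_id</code>, assuming a ring with virtual nodes. The class name <code>HashRing</code>, the server names, and the choice of SHA-256 are illustrative assumptions; a real load balancer would use its own built-in hashing.</p>

```python
import bisect
import hashlib

class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative, not production code)."""

    def __init__(self, servers, vnodes=100):
        ring = []
        for server in servers:
            for i in range(vnodes):
                ring.append((self._hash(f"{server}#{i}"), server))
        ring.sort()
        self._hashes = [h for h, _ in ring]
        self._servers = [s for _, s in ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.sha256(key.encode()).hexdigest(), 16)

    def server_for(self, user_id):
        # First ring position clockwise from the user's hash, wrapping around.
        idx = bisect.bisect_right(self._hashes, self._hash(user_id))
        return self._servers[idx % len(self._servers)]

ring = HashRing(["chat-1", "chat-2", "chat-3"])
print(ring.server_for("user-42"))
```

<p>The point of the ring: when a fourth server joins, only the users whose hash now lands on the new server's positions move. Everyone else keeps their existing connection.</p>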
<p><strong>Chat Servers — Connection Holders</strong></p>
<p>Each chat server does three things:</p>
<ol>
<li><p>Maintains WebSocket connections for its assigned users</p>
</li>
<li><p>Receives incoming messages and publishes to the message broker</p>
</li>
<li><p>Subscribes to the message broker and delivers messages to connected users</p>
</li>
</ol>
<p><strong>Message Broker (Kafka)</strong></p>
<p>The backbone. Every message flows through Kafka, which provides:</p>
<ul>
<li><p>Guaranteed delivery between chat servers</p>
</li>
<li><p>Replay capability for debugging</p>
</li>
<li><p>Fan-out for group messages</p>
</li>
<li><p>Decoupling of send and receive paths</p>
</li>
</ul>
<p><strong>Message Storage (DynamoDB)</strong></p>
<p>Persistent message history. The access pattern fits DynamoDB well: you almost always query by <code>conversation_id</code> to get recent messages.</p>
<p><strong>Presence Service (Redis)</strong></p>
<p>Tracks who is online. Redis pub/sub broadcasts presence changes to all interested servers in real time.</p>
<p><strong>Push Notification Service</strong></p>
<p>When a recipient is offline, route to APNs (iOS) or FCM (Android) instead of a WebSocket.</p>
<hr />
<h2>Part 2: Message Delivery Guarantees — The Hard Part</h2>
<p>This is where most system design discussions stop too early. Let's go deeper.</p>
<h3>The Three Guarantees</h3>
<p><strong>At-most-once:</strong> Message is sent once. If delivery fails, it's not retried. Messages can be lost. Never acceptable for chat.</p>
<p><strong>At-least-once:</strong> Message is retried until acknowledged. Messages might be delivered multiple times (duplicates). Acceptable if you handle deduplication.</p>
<p><strong>Exactly-once:</strong> Message is delivered precisely once, no duplicates, no losses. Theoretically ideal. In practice, extremely hard to implement correctly across distributed systems.</p>
<p><strong>For chat systems: target at-least-once delivery with client-side deduplication.</strong></p>
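<p>Client-side deduplication can be as simple as remembering recently seen message IDs. The sketch below assumes each message carries a stable <code>message_id</code>; the class name <code>Deduplicator</code> and the 10,000-entry bound are arbitrary choices for illustration.</p>

```python
from collections import OrderedDict

class Deduplicator:
    """Remembers the most recent message IDs so redelivered messages are dropped."""

    def __init__(self, max_entries=10_000):
        self._seen = OrderedDict()
        self._max_entries = max_entries

    def is_new(self, message_id):
        if message_id in self._seen:
            return False
        self._seen[message_id] = True
        if len(self._seen) > self._max_entries:
            self._seen.popitem(last=False)  # evict the oldest ID
        return True

dedup = Deduplicator()
incoming = [
    {"message_id": "m1", "content": "hi"},
    {"message_id": "m2", "content": "there"},
    {"message_id": "m1", "content": "hi"},  # redelivery after a lost ACK
]
shown = [m["content"] for m in incoming if dedup.is_new(m["message_id"])]
print(shown)  # ['hi', 'there']
```

<p>The client should still ACK a duplicate. Otherwise the server keeps retrying a message the user has already seen.</p>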
<h3>The Message Flow With Delivery Guarantees</h3>
<pre><code class="language-plaintext">1. Sender assigns a client-generated idempotency key (UUID)
2. Message sent to Chat Server
3. Chat Server persists to DynamoDB (idempotency key as sort key)
4. Chat Server publishes to Kafka
5. Chat Server returns ACK to sender → sender marks as "sent" ✓
6. Kafka consumer delivers to recipient's Chat Server
7. Recipient's Chat Server delivers via WebSocket
8. Recipient's client sends ACK
9. Chat Server publishes delivery receipt to Kafka
10. Sender's Chat Server receives receipt → sender marks "delivered" ✓
11. Recipient opens conversation → "read" receipt sent
12. Sender marks "read" ✓
</code></pre>
<p><strong>The idempotency key is critical.</strong> If step 3 succeeds but the server crashes before step 5, the sender will retry. Without an idempotency key, you get duplicate messages. With it, the database insert is a no-op on retry, provided the item's primary key is derived from the idempotency key so the retry targets the same item.</p>
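<pre><code class="language-python">import uuid
from datetime import datetime, timezone

class Message:
    def __init__(
        self,
        sender_id: str,
        conversation_id: str,
        content: str,
        idempotency_key: str = None
    ):
        self.idempotency_key = idempotency_key or str(uuid.uuid4())
        # Derive message_id from the idempotency key so a retry maps to the
        # same primary key; a fresh random ID would make the conditional
        # write below succeed every time and store a duplicate
        self.message_id = str(
            uuid.uuid5(uuid.NAMESPACE_OID, f"{sender_id}:{self.idempotency_key}")
        )
        self.sender_id = sender_id
        self.conversation_id = conversation_id
        self.content = content
        self.timestamp = datetime.now(timezone.utc).isoformat()
        self.status = "sent"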
async def send_message(message: Message) -&gt; dict:
    # Idempotent write — if key exists, return existing record
    try:
        await dynamodb.put_item(
            TableName="messages",
            Item={
                "conversation_id": {"S": message.conversation_id},
                "message_id": {"S": message.message_id},
                "idempotency_key": {"S": message.idempotency_key},
                "sender_id": {"S": message.sender_id},
                "content": {"S": message.content},
                "timestamp": {"S": message.timestamp},
                "status": {"S": "sent"}
            },
            ConditionExpression="attribute_not_exists(idempotency_key)"
        )
    except dynamodb.exceptions.ConditionalCheckFailedException:
        # Already saved — idempotent, return success
        pass

    # Publish to Kafka regardless (Kafka consumer handles dedup)
    await kafka_producer.send(
        topic="messages",
        key=message.conversation_id.encode(),
        value=message.__dict__
    )

    return {"status": "sent", "message_id": message.message_id}
</code></pre>
<h3>Handling Offline Users</h3>
<p>When a recipient is offline, the message must not be lost. The flow changes:</p>
<pre><code class="language-python">async def deliver_message(message: dict, recipient_id: str):
    # Try WebSocket first
    delivered = await manager.send_to_user(recipient_id, message)

    if not delivered:
        # User offline — store in their message queue
        await dynamodb.put_item(
            TableName="offline_queue",
            Item={
                "user_id": {"S": recipient_id},
                "timestamp": {"S": message["timestamp"]},
                "message_id": {"S": message["message_id"]},
                "message": {"S": json.dumps(message)}
            }
        )
        # Send push notification
        await push_service.notify(
            user_id=recipient_id,
            title=f"New message from {message['sender_id']}",
            body=message["content"][:100]
        )

async def on_user_connect(user_id: str):
    # Drain offline queue on reconnect
    queued = await dynamodb.query(
        TableName="offline_queue",
        KeyConditionExpression="user_id = :uid",
        ExpressionAttributeValues={":uid": {"S": user_id}},
        ScanIndexForward=True  # oldest first
    )

    for item in queued["Items"]:
        message = json.loads(item["message"]["S"])
        delivered = await manager.send_to_user(user_id, message)
        if not delivered:
            break  # connection dropped mid-drain; keep the rest queued

        # Delete only after a successful send so nothing is lost
        await dynamodb.delete_item(
            TableName="offline_queue",
            Key={
                "user_id": {"S": user_id},
                "timestamp": {"S": item["timestamp"]["S"]}
            }
        )
</code></pre>
<hr />
<h2>Part 3: Group Messaging — The Fan-Out Problem</h2>
<p>1-on-1 messaging is a solved problem at this point. Group messaging is where things get genuinely hard.</p>
<h3>The Fan-Out Challenge</h3>
<p>When User A sends a message to a group of 500 people:</p>
<ul>
<li><p>500 delivery operations need to happen</p>
</li>
<li><p>Members are spread across many chat servers</p>
</li>
<li><p>Some members are offline</p>
</li>
<li><p>Some members have the conversation muted</p>
</li>
<li><p>The operation needs to be fast from A's perspective</p>
</li>
</ul>
<p><strong>Two approaches:</strong></p>
<p><strong>Fan-out on write:</strong> When a message is sent, immediately write it to every member's inbox. Each member pulls from their own inbox.</p>
<pre><code class="language-plaintext">Pros: Fast reads — inbox is pre-computed
Cons: Expensive writes — 500 writes per message
     Wasteful for large groups with many inactive members
</code></pre>
<p><strong>Fan-out on read:</strong> Store the message once. When a member opens the conversation, compute their view.</p>
<pre><code class="language-plaintext">Pros: Single write per message, storage efficient
Cons: Expensive reads — compute on every open
     Slower first load for large groups
</code></pre>
<p><strong>Hybrid approach (a common compromise in large chat systems):</strong></p>
<ul>
<li><p>For small groups (&lt; 100 members): fan-out on write — fast delivery to active members</p>
</li>
<li><p>For large groups (100-500 members): store once, publish to a per-group Kafka topic, and let each member's server pull</p>
</li>
</ul>
<pre><code class="language-python">import asyncio

SMALL_GROUP_THRESHOLD = 100

async def handle_group_message(message: Message, group_id: str):
    # Get group members
    members = await get_group_members(group_id)

    if len(members) &lt; SMALL_GROUP_THRESHOLD:
        # Fan-out on write — direct delivery
        await fan_out_write(message, members)
    else:
        # Store once, publish to group topic
        await store_message(message)
        await kafka_producer.send(
            topic=f"group.{group_id}",
            value=message.__dict__
        )

async def fan_out_write(message: Message, members: list):
    # Publish one Kafka event per member
    tasks = [
        kafka_producer.send(
            topic=f"user.{member_id}",
            value=message.__dict__
        )
        for member_id in members
        if member_id != message.sender_id
    ]
    await asyncio.gather(*tasks)
</code></pre>
<h3>Group Delivery Receipts — Don't Do What iMessage Does</h3>
<p>Showing individual read receipts for every member in a 500-person group is a scaling nightmare — 500 receipt events per message read.</p>
<p><strong>Practical approach:</strong></p>
<ul>
<li><p>Store receipts as a bitmap or counter, not individual records</p>
</li>
<li><p>Show "Delivered to N members" rather than individual names</p>
</li>
<li><p>Only compute individual receipts for groups under 20 members</p>
</li>
</ul>
<pre><code class="language-python">async def update_group_receipt(
    message_id: str,
    group_id: str,
    user_id: str,
    receipt_type: str  # "delivered" or "read"
):
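    # Validate before interpolating into the update expression
    if receipt_type not in ("delivered", "read"):
        raise ValueError(f"Unknown receipt type: {receipt_type}")

    # Atomic increment: concurrent updates don't race, but a retried
    # receipt is counted twice unless it is deduplicated upstream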
    await dynamodb.update_item(
        TableName="group_receipts",
        Key={
            "message_id": {"S": message_id},
            "group_id": {"S": group_id}
        },
        UpdateExpression=f"ADD {receipt_type}_count :inc",
        ExpressionAttributeValues={":inc": {"N": "1"}}
    )
</code></pre>
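<p>The bitmap option mentioned above has one advantage over a counter: setting a bit twice is harmless, so duplicate receipts from at-least-once delivery don't inflate the total. Here's a minimal in-memory sketch, assuming the member list is fixed when the message is sent (the class name <code>ReceiptBitmap</code> is made up for illustration):</p>

```python
class ReceiptBitmap:
    """Tracks which group members have acknowledged a message, one bit per member."""

    def __init__(self, member_ids):
        # Assumption: member order is fixed for the lifetime of the message.
        self._index = {member_id: i for i, member_id in enumerate(member_ids)}
        self._bits = 0

    def mark(self, member_id):
        self._bits |= 1 << self._index[member_id]

    def has(self, member_id):
        return bool((self._bits >> self._index[member_id]) & 1)

    def count(self):
        return bin(self._bits).count("1")

receipts = ReceiptBitmap(["alice", "bob", "carol"])
receipts.mark("bob")
receipts.mark("bob")    # duplicate receipt from a retry
receipts.mark("carol")
print(f"Delivered to {receipts.count()} members")  # Delivered to 2 members
```

<p>A 500-member group needs 500 bits (63 bytes) per receipt type per message, which fits comfortably in a single item attribute.</p>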
<hr />
<h2>DynamoDB Schema</h2>
<p>The access patterns for chat are simple but the schema design matters enormously for cost and performance.</p>
<pre><code class="language-plaintext">Table: messages
├── PK: conversation_id (String)     ← partition by conversation
├── SK: timestamp#message_id (String) ← sort by time, unique
├── sender_id (String)
├── content (String)
├── message_type (String)            ← text, image, video
├── status (String)                  ← sent, delivered, read
└── idempotency_key (String)

GSI: sender_id-timestamp-index
├── PK: sender_id
└── SK: timestamp
(For "messages sent by user" queries)

Table: conversations
├── PK: user_id (String)
├── SK: last_message_timestamp (String)
├── conversation_id (String)
├── participant_ids (List)
└── unread_count (Number)

Table: offline_queue
├── PK: user_id (String)
└── SK: timestamp (String)
</code></pre>
<p><strong>Key design decisions:</strong></p>
<ul>
<li><p>Partition by <code>conversation_id</code> not <code>user_id</code> — conversations are the natural unit of access</p>
</li>
<li><p>Use composite sort key <code>timestamp#message_id</code> — enables time-range queries and guarantees uniqueness even for messages sent at the same millisecond</p>
</li>
<li><p>TTL on <code>offline_queue</code> — auto-expire after 30 days so storage doesn't grow unbounded</p>
</li>
</ul>
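<p>To make the composite sort key concrete: a fixed-width UTC timestamp sorts lexicographically in time order, and the <code>message_id</code> suffix breaks ties. The helper name <code>make_sort_key</code> and the exact timestamp format are assumptions for this sketch.</p>

```python
from datetime import datetime, timedelta, timezone

def make_sort_key(sent_at: datetime, message_id: str) -> str:
    # Fixed-width UTC timestamps compare correctly as plain strings.
    ts = sent_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")
    return f"{ts}Z#{message_id}"

t = datetime(2026, 4, 17, 11, 50, 15, 123000, tzinfo=timezone.utc)
k1 = make_sort_key(t, "a1")
k2 = make_sort_key(t, "b2")                               # same instant, different ID
k3 = make_sort_key(t + timedelta(milliseconds=1), "a0")   # one millisecond later

print(k1)                                  # 2026-04-17T11:50:15.123000Z#a1
print(sorted([k3, k2, k1]) == [k1, k2, k3])  # True
```

<p>Because the timestamp comes first, a <code>begins_with</code> or <code>BETWEEN</code> key condition on the sort key gives you time-range queries without a scan.</p>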
<hr />
<h2>Presence System</h2>
<p>Real-time online/offline indicators are deceptively complex at scale.</p>
<p><strong>The naive approach:</strong> Query the database for last-seen timestamp on every profile view. At scale, this creates a read hotspot.</p>
<p><strong>Better approach:</strong> Redis with TTL + pub/sub</p>
<pre><code class="language-python">PRESENCE_TTL = 60  # seconds

async def heartbeat(user_id: str):
    # Client sends heartbeat every 30 seconds
    # Server refreshes TTL — if it expires, user is offline
    await redis.setex(
        f"presence:{user_id}",
        PRESENCE_TTL,
        "online"
    )

async def is_online(user_id: str) -&gt; bool:
    return await redis.exists(f"presence:{user_id}") == 1

async def subscribe_to_presence(user_ids: list, callback):
    # Subscribe to presence changes for a list of users
    # Used to update UI in real time when contacts go online/offline
    pubsub = redis.pubsub()
    channels = [f"presence:{uid}" for uid in user_ids]
    await pubsub.subscribe(*channels)

    async for message in pubsub.listen():
        if message["type"] == "message":
            await callback(json.loads(message["data"]))
</code></pre>
<p><strong>Presence at scale consideration:</strong> Don't broadcast presence to every contact. WhatsApp only shows presence to mutual contacts, and even then only when the user opens a conversation. This limits the fan-out to manageable levels.</p>
<hr />
<h2>Scaling to Millions of Users</h2>
<h3>The Numbers</h3>
<table>
<thead>
<tr>
<th>Metric</th>
<th>Value</th>
</tr>
</thead>
<tbody><tr>
<td>Daily Active Users</td>
<td>50 million</td>
</tr>
<tr>
<td>Concurrent connections (20%)</td>
<td>10 million</td>
</tr>
<tr>
<td>Messages per day</td>
<td>5 billion</td>
</tr>
<tr>
<td>Messages per second (peak 3x avg)</td>
<td>~174,000</td>
</tr>
<tr>
<td>Average message size</td>
<td>1 KB</td>
</tr>
<tr>
<td>Storage per day</td>
<td>~5 TB</td>
</tr>
</tbody></table>
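<p>The table's figures follow directly from the requirements. A quick back-of-envelope check, using only the inputs assumed in this post (1 KB is treated as 1,000 bytes):</p>

```python
DAU = 50_000_000
MESSAGES_PER_USER_PER_DAY = 100
CONCURRENT_PERCENT = 20
PEAK_FACTOR = 3
AVG_MESSAGE_BYTES = 1_000  # 1 KB, decimal

messages_per_day = DAU * MESSAGES_PER_USER_PER_DAY
concurrent = DAU * CONCURRENT_PERCENT // 100
avg_mps = messages_per_day / 86_400          # seconds in a day
peak_mps = avg_mps * PEAK_FACTOR
storage_tb = messages_per_day * AVG_MESSAGE_BYTES / 1e12

print(f"Messages/day:     {messages_per_day:,}")   # 5,000,000,000
print(f"Concurrent users: {concurrent:,}")         # 10,000,000
print(f"Avg messages/s:   {avg_mps:,.0f}")         # 57,870
print(f"Peak messages/s:  {peak_mps:,.0f}")        # 173,611
print(f"Storage/day:      {storage_tb:.1f} TB")    # 5.0 TB
```

<p>The storage number covers message bodies only. Indexes, replication, and per-item metadata add to it.</p>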
<h3>Horizontal Scaling Plan</h3>
<p><strong>Chat servers:</strong> Stateless except for active WebSocket connections. Scale horizontally — add servers as concurrent connections grow. Target: 50,000 connections per server = 200 servers at peak.</p>
<p><strong>Kafka:</strong> Partition by <code>conversation_id</code>. This ensures all messages in a conversation are ordered and processed by the same consumer. Use 1,000 partitions — allows scaling to 1,000 parallel consumers.</p>
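<p><strong>DynamoDB:</strong> Managed and horizontally scalable, but capacity still needs planning. At a peak of ~174,000 writes/second, provision ~200,000 WCU with auto-scaling, and remember that an item larger than 1 KB consumes more than one WCU per write. Cost at this scale: on the order of tens of thousands of dollars per month for write capacity alone (~$35,000/month is plausible if auto-scaling tracks average rather than peak load); verify against current pricing for your region. Optimise with DynamoDB Accelerator (DAX) for read-heavy workloads like message history.</p>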
<p><strong>Redis (Presence):</strong> Use Redis Cluster. Shard by <code>user_id</code>. At 10M concurrent users, each key is ~100 bytes = ~1GB total — fits comfortably in a 3-node Redis cluster.</p>
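<p>Going back to the Kafka point above: keying by <code>conversation_id</code> works because the producer hashes the key to pick a partition, so every message in a conversation lands in the same ordered log. A rough sketch of the idea follows. Kafka's default partitioner uses murmur2; SHA-256 here is a stand-in, and <code>partition_for</code> is a hypothetical helper.</p>

```python
import hashlib

NUM_PARTITIONS = 1_000

def partition_for(conversation_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stable hash of the key, reduced modulo the partition count.
    digest = hashlib.sha256(conversation_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

first = partition_for("conv-alice-bob")
second = partition_for("conv-alice-bob")
print(first == second)  # True: same conversation, same partition, same order
```

<p>One caveat: the mapping depends on the partition count. Adding partitions later changes where existing keys land, so it's worth over-provisioning partitions up front.</p>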
<h3>Geographic Distribution</h3>
<p>For a global user base, co-locate users with the closest region:</p>
<pre><code class="language-plaintext">User in Singapore → AP-Southeast Chat Servers → AP Kafka Cluster
User in London    → EU-West Chat Servers    → EU Kafka Cluster

Cross-region messages:
AP Kafka → Cross-region replication → EU Kafka → EU Chat Server → User
</code></pre>
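<p>Use AWS Global Accelerator to route users to the nearest chat server cluster. Accept the extra cross-region hop for international messages: round trips between continents are often 100ms or more, so the sub-100ms target realistically applies within a region. That tradeoff is acceptable and much cheaper than a single global cluster.</p>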
<hr />
<h2>Media Sharing — The Bandwidth Problem</h2>
<p>Never send media through the chat server. It'll saturate your WebSocket connections.</p>
<p><strong>The right approach:</strong></p>
<pre><code class="language-plaintext">1. Client requests a pre-signed S3 URL from the media service
2. Client uploads directly to S3 (bypasses chat server entirely)
3. Client sends a message containing the S3 object key
4. Recipient's client downloads directly from CloudFront (CDN)
</code></pre>
<pre><code class="language-python">import uuid

import boto3
from botocore.config import Config

s3 = boto3.client(
    's3',
    config=Config(signature_version='s3v4')
)

async def get_upload_url(
    user_id: str,
    file_type: str,
    file_size_bytes: int
) -&gt; dict:
    # Validate file size (50MB limit)
    if file_size_bytes &gt; 50 * 1024 * 1024:
        raise ValueError("File too large")

    object_key = f"media/{user_id}/{uuid.uuid4()}"

    presigned_url = s3.generate_presigned_url(
        'put_object',
        Params={
            'Bucket': 'chat-media-bucket',
            'Key': object_key,
            'ContentType': file_type,
            'ContentLength': file_size_bytes
        },
        ExpiresIn=300  # 5 minutes to complete upload
    )

    return {
        "upload_url": presigned_url,
        "object_key": object_key,
        "cdn_url": f"https://cdn.yourdomain.com/{object_key}"
    }
</code></pre>
<p>This keeps your chat servers lean — they only handle small JSON payloads, never binary data.</p>
<hr />
<h2>What I'd Skip in V1</h2>
<p>Not everything needs to be built on day one:</p>
<ul>
<li><p><strong>End-to-end encryption</strong> — Signal Protocol is complex. Add it in V2 once the core is stable.</p>
</li>
<li><p><strong>Voice and video calls</strong> — WebRTC is a separate system entirely. Separate service, separate team.</p>
</li>
<li><p><strong>Message reactions</strong> — Nice to have. A simple DynamoDB list attribute works until you have emoji reaction analytics requirements.</p>
</li>
<li><p><strong>Message editing and deletion</strong> — Soft delete first (mark as deleted), hard delete later when compliance requirements are clear.</p>
</li>
<li><p><strong>Multi-device sync</strong> — Start with single-device. Multi-device sync (like WhatsApp Web) is a significant engineering effort involving device registration and message fan-out to device sets.</p>
</li>
</ul>
<hr />
<h2>Key Takeaways</h2>
<ol>
<li><p><strong>WebSockets are necessary but not sufficient</strong> — you also need a message broker between servers to route messages across your fleet</p>
</li>
<li><p><strong>Target at-least-once delivery with client-side deduplication</strong> — exactly-once is theoretically appealing but practically expensive</p>
</li>
<li><p><strong>Idempotency keys are non-negotiable</strong> — retries without them create duplicate messages</p>
</li>
<li><p><strong>Fan-out strategy depends on group size</strong> — write fan-out for small groups, read fan-out for large ones</p>
</li>
<li><p><strong>Never route media through chat servers</strong> — S3 pre-signed URLs + CloudFront keep your servers fast</p>
</li>
<li><p><strong>Presence is a separate problem</strong> — Redis TTL + pub/sub, not a database column</p>
</li>
<li><p><strong>Partition Kafka by conversation_id</strong> — preserves message ordering within a conversation</p>
</li>
<li><p><strong>DynamoDB partition key is conversation, not user</strong> — conversations are the natural unit of access in chat</p>
</li>
</ol>
<hr />
<h2>What Would You Design Differently?</h2>
<p>Every chat system has its own constraints. Some teams prioritise encryption above all else. Others need to support 10,000-member broadcast channels. Some operate in regions where push notifications are unreliable.</p>
<p>What's the trickiest chat-related engineering problem you've encountered? Drop it in the comments — I read every one.</p>
<hr />
<p><em>Follow me on</em> <a href="https://www.linkedin.com/in/manbhasin"><em>LinkedIn</em></a> <em>for weekly posts on system design, AWS, and engineering career growth.</em></p>
]]></content:encoded></item><item><title><![CDATA[Designing a Rate Limiter From Scratch — Token Bucket vs Sliding Window vs Fixed Window vs Leaky Bucket]]></title><description><![CDATA[Why Most Rate Limiter Articles Miss the Point
Search "rate limiter system design" and you'll find two kinds of articles.
The first kind gives you a surface-level overview of algorithms with no real im]]></description><link>https://blog.madhav.dev/designing-a-rate-limiter-from-scratch</link><guid isPermaLink="true">https://blog.madhav.dev/designing-a-rate-limiter-from-scratch</guid><category><![CDATA[System Design]]></category><category><![CDATA[Python]]></category><category><![CDATA[Redis]]></category><category><![CDATA[AWS]]></category><category><![CDATA[rate-limiting]]></category><category><![CDATA[distributed system]]></category><dc:creator><![CDATA[Madhav Bhasin]]></dc:creator><pubDate>Fri, 10 Apr 2026 16:46:31 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/622ac3bee552e99d82e59b17/2eb7c3c6-9cd6-4294-bc90-0dea2550e985.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Why Most Rate Limiter Articles Miss the Point</h2>
<p>Search "rate limiter system design" and you'll find two kinds of articles.</p>
<p>The first kind gives you a surface-level overview of algorithms with no real implementation details. The second gives you a Redis snippet with no explanation of why that algorithm was chosen, what its failure modes are, or how it behaves under traffic spikes.</p>
<p>Neither prepares you for the real question — which algorithm do you actually use, and when?</p>
<p>This post covers all four major rate-limiting algorithms, compares them honestly, and shows you how to implement per-user and per-IP limiting in a distributed system. By the end, you'll have a decision framework you can apply to any project — not just a definition to recite in an interview.</p>
<hr />
<h2>What Is Rate Limiting and Why Does It Matter?</h2>
<p>Rate limiting controls how many requests a client can make to your system in a given time window.</p>
<p>Without it:</p>
<ul>
<li><p>A single misbehaving client can exhaust your server resources</p>
</li>
<li><p>A credential stuffing attack tries thousands of passwords per second</p>
</li>
<li><p>A DDoS amplification attack overwhelms your upstream services</p>
</li>
<li><p>One heavy API consumer degrades the experience for everyone else</p>
</li>
</ul>
<p>Rate limiting protects your system at the edge — before expensive business logic runs.</p>
<hr />
<h2>The Four Algorithms</h2>
<h3>1. Fixed Window Counter</h3>
<p><strong>How it works:</strong></p>
<p>Divide time into fixed windows (e.g., every 60 seconds). Count requests per client per window. Reject once the count exceeds the limit.</p>
<pre><code class="language-plaintext">Window: 12:00:00 → 12:01:00  |  Limit: 100 requests
 
Client A: ████████████████ 87 requests  ✅
Client B: ████████████████████████ 100 requests  ✅ (101st rejected)
</code></pre>
<p><strong>Implementation (Python + Redis):</strong></p>
<pre><code class="language-python">import redis
import time
 
r = redis.Redis()
 
def is_allowed_fixed_window(client_id: str, limit: int, window_seconds: int) -&gt; bool:
    window_key = int(time.time() // window_seconds)
    key = f"rate:{client_id}:{window_key}"
    
    count = r.incr(key)
    if count == 1:
        r.expire(key, window_seconds)
    
    return count &lt;= limit
</code></pre>
<p><strong>Pros:</strong></p>
<ul>
<li><p>Dead simple to implement</p>
</li>
<li><p>Very low memory — one counter per client per window</p>
</li>
<li><p>Easy to reason about and debug</p>
</li>
</ul>
<p><strong>Cons — the boundary problem:</strong></p>
<p>This is the critical flaw. A client can make 100 requests at 12:00:59 and another 100 at 12:01:01 — 200 requests in 2 seconds, both within their respective windows but double the intended limit.</p>
<pre><code class="language-plaintext">Window 1 ends ──────────────────┐
                                 │
                  100 requests ──┘└── 100 requests
                                 │
Window 2 starts ─────────────────┘
 
200 requests in ~2 seconds. Window boundary exploited.
</code></pre>
<p><strong>When to use it:</strong> Internal admin tools, low-stakes APIs, anywhere simplicity matters more than precision.</p>
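<p>To put numbers on the boundary problem, here is a minimal in-memory sketch of the same counter. The <code>InMemoryFixedWindow</code> class and the explicit <code>now</code> argument are assumptions made for this illustration: they stand in for Redis and the wall clock so the timing is reproducible.</p>
<pre><code class="language-python">from collections import defaultdict


class InMemoryFixedWindow:
    """Fixed window counter kept in a dict (illustration only, no Redis)."""

    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window_seconds = window_seconds
        self.counts = defaultdict(int)  # (client_id, window index) -&gt; request count

    def allow(self, client_id: str, now: float) -&gt; bool:
        window = int(now // self.window_seconds)
        self.counts[(client_id, window)] += 1
        return self.counts[(client_id, window)] &lt;= self.limit


limiter = InMemoryFixedWindow(limit=100, window_seconds=60)

# 100 requests in the last second of one window, 100 in the first second of the next
late = sum(limiter.allow("client-a", now=59.5) for _ in range(100))
early = sum(limiter.allow("client-a", now=60.5) for _ in range(100))

print(late + early)  # 200
</code></pre>
<p>Each window stays within its own limit, yet the client gets 200 requests through in about one second.</p>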
<hr />
<h3>2. Sliding Window Counter</h3>
<p><strong>How it works:</strong></p>
<p>A refinement of Fixed Window that weights the previous window's count based on how far through the current window you are.</p>
<pre><code class="language-plaintext">Current count = 
  (previous_window_count × overlap_ratio) + current_window_count
</code></pre>
<p><strong>Example:</strong></p>
<p>You're 25% through the current window. Previous window had 80 requests. Current window has 30 so far.</p>
<pre><code class="language-plaintext">Weighted count = (80 × 0.75) + 30 = 60 + 30 = 90
</code></pre>
<p><strong>Implementation:</strong></p>
<pre><code class="language-python">def is_allowed_sliding_window(client_id: str, limit: int, window_seconds: int) -&gt; bool:
    now = time.time()
    current_window = int(now // window_seconds)
    previous_window = current_window - 1
    
    # How far through the current window are we?
    elapsed = now % window_seconds
    overlap_ratio = 1 - (elapsed / window_seconds)
    
    current_key = f"rate:{client_id}:{current_window}"
    previous_key = f"rate:{client_id}:{previous_window}"
    
    current_count = int(r.get(current_key) or 0)
    previous_count = int(r.get(previous_key) or 0)
    
    weighted_count = (previous_count * overlap_ratio) + current_count
    
    if weighted_count &gt;= limit:
        return False
    
    count = r.incr(current_key)
    if count == 1:
        r.expire(current_key, window_seconds * 2)
    
    return True
</code></pre>
<p><strong>Pros:</strong></p>
<ul>
<li><p>Solves the boundary burst problem</p>
</li>
<li><p>Still memory-efficient — just two counters per client</p>
</li>
<li><p>Good approximation of true sliding window behaviour</p>
</li>
</ul>
<p><strong>Cons:</strong></p>
<ul>
<li><p>The weighting is an approximation — not perfectly accurate</p>
</li>
<li><p>Slightly harder to implement correctly</p>
</li>
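<li><p>Edge cases around window boundaries need care, and the read-then-increment in the code above is not atomic, so concurrent requests can slip slightly past the limit unless the check moves into a Lua script</p>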
</li>
</ul>
<p><strong>When to use it:</strong> Public APIs where you want fairness without the complexity of a full sliding window log.</p>
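<p>For contrast, the "full sliding window log" mentioned above can be sketched in a few lines. This is an in-memory illustration, not a Redis implementation; the <code>SlidingWindowLog</code> name and the explicit <code>now</code> argument are assumptions for the example. It is exact, but it stores one timestamp per accepted request, which is the memory cost the weighted counter avoids.</p>
<pre><code class="language-python">from collections import deque


class SlidingWindowLog:
    """Exact sliding window: keeps one timestamp per accepted request."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window_seconds = window_seconds
        self.log = deque()

    def allow(self, now: float) -&gt; bool:
        # Drop timestamps that have fallen out of the trailing window
        cutoff = now - self.window_seconds
        while self.log and self.log[0] &lt;= cutoff:
            self.log.popleft()
        if len(self.log) &gt;= self.limit:
            return False
        self.log.append(now)
        return True


log = SlidingWindowLog(limit=100, window_seconds=60)
first_burst = sum(log.allow(59.5) for _ in range(100))
second_burst = sum(log.allow(60.5) for _ in range(100))

print(first_burst, second_burst)  # 100 0
</code></pre>
<p>The second burst that a fixed window would accept is rejected in full, because the log still holds 100 requests from the last 60 seconds.</p>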
<hr />
<h3>3. Token Bucket</h3>
<p><strong>How it works:</strong></p>
<p>Each client has a bucket that holds tokens. Tokens refill at a constant rate up to a maximum capacity. Each request consumes one token. No token = request rejected.</p>
<pre><code class="language-plaintext">Bucket capacity: 10 tokens
Refill rate: 2 tokens/second
 
t=0:  [██████████] 10 tokens
t=1:  Request → [█████████ ] 9 tokens
t=1:  Request → [████████  ] 8 tokens
t=2:  Refill  → [██████████] 10 tokens (capped at max)
</code></pre>
<p><strong>Implementation:</strong></p>
<pre><code class="language-python">import time
 
def is_allowed_token_bucket(
    client_id: str,
    capacity: int,
    refill_rate: float  # tokens per second
) -&gt; bool:
    key = f"token_bucket:{client_id}"
    now = time.time()
    
    # Lua script for atomic read-modify-write
    lua_script = """
    local key = KEYS[1]
    local capacity = tonumber(ARGV[1])
    local refill_rate = tonumber(ARGV[2])
    local now = tonumber(ARGV[3])
    
    local bucket = redis.call('HMGET', key, 'tokens', 'last_refill')
    local tokens = tonumber(bucket[1]) or capacity
    local last_refill = tonumber(bucket[2]) or now
    
    -- Add tokens based on time elapsed
    local elapsed = now - last_refill
    tokens = math.min(capacity, tokens + elapsed * refill_rate)
    
    if tokens &lt; 1 then
        return 0  -- Rejected
    end
    
    tokens = tokens - 1
    redis.call('HMSET', key, 'tokens', tokens, 'last_refill', now)
    redis.call('EXPIRE', key, 3600)
    return 1  -- Allowed
    """
    
    result = r.eval(lua_script, 1, key, capacity, refill_rate, now)
    return result == 1
</code></pre>
<p><strong>Why Lua?</strong> Redis executes Lua scripts atomically — essential here to prevent race conditions where two concurrent requests both read the same token count and both think they can proceed.</p>
<p><strong>Pros:</strong></p>
<ul>
<li><p>Handles bursts naturally — a client can use saved-up tokens for a short burst</p>
</li>
<li><p>Smooth traffic shaping</p>
</li>
<li><p>Widely used in production (AWS API Gateway uses token bucket internally)</p>
</li>
</ul>
<p><strong>Cons:</strong></p>
<ul>
<li><p>Two parameters to tune (capacity + refill rate) — easy to misconfigure</p>
</li>
<li><p>Slightly more complex implementation</p>
</li>
<li><p>Burst behaviour can be surprising if you're not expecting it</p>
</li>
</ul>
<p><strong>When to use it:</strong> APIs where some bursting is acceptable — e.g. a user uploading a batch of files. The token bucket lets them burst briefly without being punished for it.</p>
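<p>The two parameters are easier to tune once you see what each one controls. The sketch below is an in-memory stand-in for the Redis version (the <code>InMemoryTokenBucket</code> class and explicit <code>now</code> values are assumptions for the example): capacity sets the largest burst, refill rate sets the sustained throughput.</p>
<pre><code class="language-python">class InMemoryTokenBucket:
    """Token bucket with an explicit clock, for experimenting with parameters."""

    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens per second
        self.tokens = float(capacity)
        self.last_refill = 0.0

    def allow(self, now: float) -&gt; bool:
        # Credit tokens for the time elapsed since the last call, capped at capacity
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens &lt; 1:
            return False
        self.tokens -= 1
        return True


bucket = InMemoryTokenBucket(capacity=10, refill_rate=2)
burst = sum(bucket.allow(now=0.0) for _ in range(15))   # 15 requests at once
later = sum(bucket.allow(now=1.0) for _ in range(5))    # 5 more, one second later

print(burst, later)  # 10 2
</code></pre>
<p>The burst is capped at the bucket capacity (10), and one second later only the refilled tokens (2) are available.</p>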
<hr />
<h3>4. Leaky Bucket</h3>
<p><strong>How it works:</strong></p>
<p>Requests enter a queue (the bucket). They're processed at a constant rate regardless of how fast they arrive. If the queue is full, new requests are rejected.</p>
<pre><code class="language-plaintext">Incoming requests (variable rate)
         │││││││││
         ▼▼▼▼▼▼▼▼▼
    ┌─────────────┐
    │  Queue      │ ← Max capacity: 10
    │  ▓▓▓▓▓▓▓   │
    └──────┬──────┘
           │ constant outflow
           ▼▼▼▼▼▼▼▼  (e.g. 5 req/sec)
</code></pre>
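<p><strong>Implementation:</strong></p>
<p>The version below tracks the queue level as a counter and accepts or rejects each request immediately; it does not hold requests back and release them later. That makes it a rate check with leaky-bucket accounting rather than a true queue.</p>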
<pre><code class="language-python">def is_allowed_leaky_bucket(
    client_id: str,
    capacity: int,
    leak_rate: float  # requests per second
) -&gt; bool:
    key = f"leaky:{client_id}"
    now = time.time()
    
    lua_script = """
    local key = KEYS[1]
    local capacity = tonumber(ARGV[1])
    local leak_rate = tonumber(ARGV[2])
    local now = tonumber(ARGV[3])
    
    local bucket = redis.call('HMGET', key, 'queue_size', 'last_leak')
    local queue_size = tonumber(bucket[1]) or 0
    local last_leak = tonumber(bucket[2]) or now
    
    -- Drain the bucket based on elapsed time. Keep the level fractional:
    -- flooring here while resetting last_leak would discard partial leaks.
    local elapsed = now - last_leak
    queue_size = math.max(0, queue_size - elapsed * leak_rate)
    
    if queue_size + 1 &gt; capacity then
        return 0  -- Bucket full, reject
    end
    
    queue_size = queue_size + 1
    redis.call('HMSET', key, 'queue_size', queue_size, 'last_leak', now)
    redis.call('EXPIRE', key, 3600)
    return 1  -- Allowed
    """
    
    result = r.eval(lua_script, 1, key, capacity, leak_rate, now)
    return result == 1
</code></pre>
<p><strong>Pros:</strong></p>
<ul>
<li><p>Guarantees a perfectly smooth, consistent outflow rate</p>
</li>
<li><p>Protects downstream services from any burst whatsoever</p>
</li>
<li><p>Predictable processing rate</p>
</li>
</ul>
<p><strong>Cons:</strong></p>
<ul>
<li><p>Bursts larger than the queue capacity are rejected, even legitimate ones</p>
</li>
<li><p>Queue management adds latency</p>
</li>
<li><p>Less intuitive than token bucket</p>
</li>
</ul>
<p><strong>When to use it:</strong> Protecting downstream services that can't handle any variation in request rate — payment processors, third-party APIs with strict rate limits, legacy systems with fixed throughput.</p>
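<p>The queueing behaviour described above (hold requests, release them at a constant rate) can be sketched by assigning each request a departure time. This is an in-memory illustration under stated assumptions: the <code>LeakyBucketQueue</code> class and its <code>schedule</code> method are hypothetical names, and the caller is expected to wait until the returned time before forwarding the request.</p>
<pre><code class="language-python">from collections import deque


class LeakyBucketQueue:
    """Assigns evenly spaced departure times; rejects when the queue is full."""

    def __init__(self, capacity: int, leak_rate: float):
        self.capacity = capacity
        self.interval = 1.0 / leak_rate   # seconds between departures
        self.departures = deque()         # departure times of requests still waiting
        self.last_departure = None

    def schedule(self, now: float):
        """Return the departure time for a request arriving at `now`, or None if rejected."""
        # Requests whose departure time has passed have already left the queue
        while self.departures and self.departures[0] &lt;= now:
            self.departures.popleft()
        if len(self.departures) &gt;= self.capacity:
            return None
        if self.last_departure is None:
            departure = now
        else:
            departure = max(now, self.last_departure + self.interval)
        self.last_departure = departure
        self.departures.append(departure)
        return departure


bucket = LeakyBucketQueue(capacity=10, leak_rate=5)
results = [bucket.schedule(0.0) for _ in range(15)]
accepted = [t for t in results if t is not None]

print(len(accepted))  # 11: one leaves immediately, ten wait in the queue
</code></pre>
<p>However bursty the arrivals, accepted requests leave about 0.2 seconds apart, which is the smooth outflow the downstream service sees.</p>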
<hr />
<h2>Algorithm Comparison</h2>
<table>
<thead>
<tr>
<th>Algorithm</th>
<th>Burst Handling</th>
<th>Memory</th>
<th>Accuracy</th>
<th>Implementation</th>
<th>Best For</th>
</tr>
</thead>
<tbody><tr>
<td>Fixed Window</td>
<td>❌ Boundary exploitable</td>
<td>✅ Very low</td>
<td>⚠️ Approximate</td>
<td>✅ Simple</td>
<td>Internal tools</td>
</tr>
<tr>
<td>Sliding Window</td>
<td>✅ Good</td>
<td>✅ Low</td>
<td>✅ Good</td>
<td>⚠️ Moderate</td>
<td>Public APIs</td>
</tr>
<tr>
<td>Token Bucket</td>
<td>✅ Allows bursts</td>
<td>⚠️ Medium</td>
<td>✅ Accurate</td>
<td>⚠️ Moderate</td>
<td>User-facing APIs</td>
</tr>
<tr>
<td>Leaky Bucket</td>
<td>❌ Smooths all bursts</td>
<td>⚠️ Medium</td>
<td>✅ Accurate</td>
<td>⚠️ Moderate</td>
<td>Downstream protection</td>
</tr>
</tbody></table>
<hr />
<h2>Per-User vs Per-IP Limiting</h2>
<p>This is where most tutorials stop. Real systems need both.</p>
<h3>The Problem With IP-Only Limiting</h3>
<ul>
<li><p>Users behind corporate NAT share one IP — limit one, limit all</p>
</li>
<li><p>IPv6 makes IP-based blocking trivial to bypass (rotate addresses)</p>
</li>
<li><p>Mobile users switch IPs constantly</p>
</li>
</ul>
<h3>The Problem With User-Only Limiting</h3>
<ul>
<li><p>Unauthenticated endpoints can't use user IDs</p>
</li>
<li><p>Login endpoints need IP limiting before a user ID is even known</p>
</li>
<li><p>Credential stuffing attacks rotate user accounts</p>
</li>
</ul>
<h3>The Solution: Layered Limiting</h3>
<p>Apply both, with different limits and algorithms for each layer:</p>
<pre><code class="language-python">from fastapi import FastAPI, Request, HTTPException
from typing import Optional
 
app = FastAPI()
 
# Layer 1: IP-based (protects unauthenticated endpoints)
IP_LIMIT = 200        # requests per minute per IP
IP_WINDOW = 60
 
# Layer 2: User-based (protects authenticated endpoints)
USER_LIMIT = 100      # requests per minute per user
USER_WINDOW = 60
 
# Layer 3: Endpoint-specific (protects sensitive endpoints)
LOGIN_LIMIT = 5       # login attempts per minute per IP
LOGIN_WINDOW = 60
 
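def get_client_ip(request: Request) -&gt; str:
    # X-Forwarded-For is client-controlled. Only trust it when a load balancer
    # you operate sets or overwrites it; if the proxy merely appends, the
    # left-most entry is whatever the caller sent and can be spoofed.
    forwarded_for = request.headers.get("X-Forwarded-For")
    if forwarded_for:
        return forwarded_for.split(",")[0].strip()
    # request.client can be None (e.g. in some test setups)
    return request.client.host if request.client else "unknown"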
 
async def rate_limit_middleware(
    request: Request,
    user_id: Optional[str] = None
):
    client_ip = get_client_ip(request)
    endpoint = request.url.path
    
    # Layer 1: Always check IP
    if not is_allowed_sliding_window(f"ip:{client_ip}", IP_LIMIT, IP_WINDOW):
        raise HTTPException(
            status_code=429,
            detail="Too many requests from this IP",
            headers={"Retry-After": "60"}
        )
    
    # Layer 2: Check user if authenticated
    if user_id:
        if not is_allowed_token_bucket(f"user:{user_id}", USER_LIMIT, USER_LIMIT/USER_WINDOW):
            raise HTTPException(
                status_code=429,
                detail="Rate limit exceeded. Please slow down.",
                headers={"Retry-After": "60"}
            )
    
    # Layer 3: Tighter limits for sensitive endpoints
    sensitive_endpoints = ["/auth/login", "/auth/register", "/auth/reset-password"]
    if endpoint in sensitive_endpoints:
        if not is_allowed_fixed_window(f"login:{client_ip}", LOGIN_LIMIT, LOGIN_WINDOW):
            raise HTTPException(
                status_code=429,
                detail="Too many authentication attempts",
                headers={"Retry-After": "60", "X-RateLimit-Limit": str(LOGIN_LIMIT)}
            )
 
 
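from fastapi.responses import JSONResponse  # usually placed with the imports at the top


@app.middleware("http")
async def apply_rate_limiting(request: Request, call_next):
    user_id = request.headers.get("X-User-ID")  # Set by auth middleware upstream
    try:
        await rate_limit_middleware(request, user_id)
    except HTTPException as exc:
        # Exceptions raised inside HTTP middleware bypass FastAPI's exception
        # handlers, so convert the 429 into a response explicitly
        return JSONResponse(
            status_code=exc.status_code,
            content={"detail": exc.detail},
            headers=exc.headers,
        )
    response = await call_next(request)
    return response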
</code></pre>
<hr />
<h2>Distributed Rate Limiting — The Hard Part</h2>
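<p>In-process counters work fine until you have multiple API servers. The challenge: requests from the same client hit different servers, and if each server keeps its own in-memory state, every server enforces the limit separately. The fix is to keep rate limit state in a shared store such as Redis.</p>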
<h3>Architecture</h3>
<pre><code class="language-plaintext">Client
  │
  ▼
Load Balancer
  ├── API Server 1 ──┐
  ├── API Server 2 ──┼──► Redis Cluster (shared state)
  └── API Server 3 ──┘
</code></pre>
<p>All servers share a single Redis cluster for rate limit state. This works well up to very high scale.</p>
<h3>Redis Cluster Considerations</h3>
<ul>
<li><p>Use <strong>Redis Cluster</strong> (not Sentinel) for horizontal scaling</p>
</li>
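<li><p>Each key maps to exactly one hash slot, so every API server reads and writes the same counter. Keys for the same client land on the same node only if they share a hash tag (e.g. <code>rate:{client_id}:window</code>), which matters as soon as one Lua script touches more than one key</p>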
</li>
<li><p>Use <strong>connection pooling</strong> — don't open a new Redis connection per request</p>
</li>
</ul>
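<pre><code class="language-python"># Production Redis setup with connection pooling (single endpoint).
# With cluster mode enabled, use redis.cluster.RedisCluster instead so the
# client follows slot redirects.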
import redis
 
pool = redis.ConnectionPool(
    host='your-redis-cluster-endpoint',
    port=6379,
    max_connections=50,
    socket_connect_timeout=1,
    socket_timeout=1
)
 
r = redis.Redis(connection_pool=pool)
</code></pre>
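<p>On the hash slot point above: Redis Cluster assigns a key to one of 16384 slots using a CRC16 of the key, or of the part inside <code>{...}</code> if the key contains a non-empty hash tag. The sketch below reproduces that rule locally so you can check key layouts without a cluster. The <code>key_slot</code> helper is a name made up for this example, and it assumes Python's <code>binascii.crc_hqx</code> with an initial value of 0 matches the CRC16 variant in the cluster specification.</p>
<pre><code class="language-python">import binascii


def key_slot(key: str) -&gt; int:
    """Cluster slot for a key: CRC16 of the key (or its hash tag) mod 16384."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty {...} section counts as a hash tag
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % 16384


# Current and previous window keys for one client, with a hash tag
tagged = {key_slot("rate:{user42}:100"), key_slot("rate:{user42}:101")}

print(len(tagged))  # 1
</code></pre>
<p>With the tag, both window keys for a client hash to one slot, so a single Lua script can read them together. Without it, the two keys are hashed independently and may live on different nodes.</p>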
<h3>What Happens When Redis Goes Down?</h3>
<p>Two options and you need to decide upfront:</p>
<p><strong>Fail open</strong> — if Redis is unavailable, allow all requests. Traffic flows, no revenue impact, risk of abuse.</p>
<p><strong>Fail closed</strong> — if Redis is unavailable, reject all requests. No abuse, but service is degraded for legitimate users.</p>
<p>For most APIs, <strong>fail open</strong> with alerting is the right call. Add a circuit breaker:</p>
<pre><code class="language-python">import time

import redis

redis_healthy = True
last_redis_check = 0

def is_allowed_with_fallback(client_id: str, limit: int, window: int) -&gt; bool:
    global redis_healthy, last_redis_check

    if not redis_healthy:
        # Circuit open: skip Redis entirely until the 5-second cooldown has passed
        if time.time() - last_redis_check &lt; 5:
            return True  # Fail open
        last_redis_check = time.time()
        try:
            r.ping()
            redis_healthy = True
        except redis.RedisError:
            return True  # Still down, keep failing open

    try:
        return is_allowed_sliding_window(client_id, limit, window)
    except redis.RedisError:
        redis_healthy = False
        last_redis_check = time.time()
        return True  # Fail open: allow the request and log the issue
</code></pre>
<hr />
<h2>Response Headers — Don't Forget These</h2>
<p>Always return rate limit headers so clients can self-throttle:</p>
<pre><code class="language-python">import time

from fastapi import Response

def add_rate_limit_headers(
    response: Response,
    limit: int,
    remaining: int,
    reset_at: int  # Unix timestamp
):
    response.headers["X-RateLimit-Limit"] = str(limit)
    response.headers["X-RateLimit-Remaining"] = str(max(0, remaining))
    response.headers["X-RateLimit-Reset"] = str(reset_at)
    # Clamp at zero so a reset time in the past never yields a negative value
    response.headers["Retry-After"] = str(max(0, reset_at - int(time.time())))
</code></pre>
<p>Well-behaved API clients use these headers to back off automatically — which means fewer retries hammering your system when limits are hit.</p>
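<p>On the client side, "backing off automatically" can be as small as the sketch below. It is an illustration under assumptions: the <code>seconds_to_wait</code> helper is hypothetical, it treats <code>Retry-After</code> as a number of seconds (the header may also carry an HTTP date, which this sketch does not parse), and the one-second default is arbitrary.</p>
<pre><code class="language-python">def seconds_to_wait(status_code: int, headers: dict, now: float) -&gt; float:
    """How long a client should pause before retrying, based on rate limit headers."""
    if status_code != 429:
        return 0.0
    if "Retry-After" in headers:
        return max(0.0, float(headers["Retry-After"]))
    if "X-RateLimit-Reset" in headers:
        # Reset is a Unix timestamp; wait until then
        return max(0.0, float(headers["X-RateLimit-Reset"]) - now)
    return 1.0  # assumed default when the server gives no hint


print(seconds_to_wait(429, {"Retry-After": "60"}, now=1000.0))  # 60.0
</code></pre>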
<hr />
<h2>Where to Put the Rate Limiter</h2>
<p>You have three options:</p>
<p><strong>Option 1: API Gateway (AWS API Gateway / Kong)</strong></p>
<ul>
<li><p>Easiest to set up — configure, not code</p>
</li>
<li><p>Limited flexibility — hard to do per-user logic</p>
</li>
<li><p>Good for baseline IP protection</p>
</li>
</ul>
<p><strong>Option 2: Middleware in your application</strong></p>
<ul>
<li><p>Full control over logic</p>
</li>
<li><p>Access to user context, request data, business rules</p>
</li>
<li><p>The implementation shown above</p>
</li>
</ul>
<p><strong>Option 3: Dedicated rate limit service</strong></p>
<ul>
<li><p>Centralised policy management across multiple services</p>
</li>
<li><p>More infrastructure to maintain</p>
</li>
<li><p>Worth it once several services (roughly 5+) need the same policy</p>
</li>
</ul>
<p>For most teams: <strong>start with middleware, move to a dedicated service when you have 5+ services that need a consistent rate-limiting policy.</strong></p>
<hr />
<h2>Key Takeaways</h2>
<ol>
<li><p><strong>Fixed Window</strong> is fine for low-stakes internal APIs — don't over-engineer</p>
</li>
<li><p><strong>Sliding Window</strong> is the pragmatic choice for most public APIs — good accuracy, low cost</p>
</li>
<li><p><strong>Token Bucket</strong> is best when some bursting is acceptable and expected</p>
</li>
<li><p><strong>Leaky Bucket</strong> is for protecting downstream services that need perfectly smooth traffic</p>
</li>
<li><p><strong>Layer your limits</strong> — IP-level and user-level serve different purposes, use both</p>
</li>
<li><p><strong>Make every read-modify-write atomic in Redis</strong>: use a Lua script (or a single atomic command such as <code>INCR</code>), because atomicity is not optional</p>
</li>
<li><p><strong>Decide your Redis failure mode upfront</strong> — fail open or fail closed, document it</p>
</li>
<li><p><strong>Return rate limit headers</strong> — good clients will use them and reduce your load</p>
</li>
</ol>
<hr />
<h2>What Would You Do Differently?</h2>
<p>Every system has different constraints. Some teams need per-organisation limits, per-endpoint limits, or dynamic limits that change based on subscription tier.</p>
<p>What's the most interesting rate-limiting challenge you've tackled? Drop it in the comments — I read every one.</p>
<hr />
<p><em>If you found this useful, follow me on</em> <a href="https://www.linkedin.com/in/manbhasin"><em>LinkedIn</em></a> <em>where I post about system design, AWS, and engineering career growth every week.</em></p>
]]></content:encoded></item><item><title><![CDATA[How I'd Design a URL Shortener on AWS — and What Most Tutorials Get Wrong]]></title><description><![CDATA[The Problem With Most URL Shortener Tutorials
Search "URL shortener system design" and you'll find hundreds of articles. Most of them give you the same thing: a single server, a database with a short_]]></description><link>https://blog.madhav.dev/how-i-d-design-a-url-shortener-on-aws-and-what-most-tutorials-get-wrong</link><guid isPermaLink="true">https://blog.madhav.dev/how-i-d-design-a-url-shortener-on-aws-and-what-most-tutorials-get-wrong</guid><category><![CDATA[System Design]]></category><category><![CDATA[AWS]]></category><category><![CDATA[architecture]]></category><category><![CDATA[Redis]]></category><category><![CDATA[serverless]]></category><dc:creator><![CDATA[Madhav Bhasin]]></dc:creator><pubDate>Fri, 03 Apr 2026 08:21:02 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/622ac3bee552e99d82e59b17/30b7b52e-9a46-4f96-b84e-d951da8a9429.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>The Problem With Most URL Shortener Tutorials</h2>
<p>Search "URL shortener system design" and you'll find hundreds of articles. Most of them give you the same thing: a single server, a database with a <code>short_code</code> column, and a redirect endpoint.</p>
<p>That works for a demo. It doesn't work for production.</p>
<p>The real challenge isn't building a URL shortener. It's building one that handles millions of redirects per day, stays available when things go wrong, and doesn't cost a fortune to run. Those constraints change every decision you make.</p>
<p>Here's how I'd actually design it on AWS — and the tradeoffs I'd make along the way.</p>
<hr />
<h2>Requirements First</h2>
<p>Before touching architecture, let's be explicit about what we're building.</p>
<p><strong>Functional requirements:</strong></p>
<ul>
<li><p>Given a long URL, generate a short code (e.g. <code>sho.rt/xK92p</code>)</p>
</li>
<li><p>Given a short code, redirect to the original URL</p>
</li>
<li><p>Short codes should be unique and ideally human-readable enough to share</p>
</li>
<li><p>Optional: custom aliases, expiry dates, click analytics</p>
</li>
</ul>
<p><strong>Non-functional requirements:</strong></p>
<ul>
<li><p>Reads (redirects) massively outnumber writes (URL creation) — think 100:1 ratio</p>
</li>
<li><p>Redirect latency must be low — under 50ms ideally</p>
</li>
<li><p>High availability — if this goes down, every link on the internet that uses it breaks</p>
</li>
<li><p>Short codes must not collide</p>
</li>
</ul>
<p>These constraints drive every architectural decision below.</p>
<hr />
<h2>The Naive Approach (And Why It Fails)</h2>
<p>Most tutorials suggest this:</p>
<pre><code class="language-plaintext">User → Web Server → Database (lookup short_code) → Redirect
</code></pre>
<p>Simple. Works at low scale. Falls apart fast because:</p>
<ul>
<li><p>Every redirect hits the database — at 10,000 requests/second, your DB is the bottleneck</p>
</li>
<li><p>Single server = single point of failure</p>
</li>
<li><p>No geographic distribution — users in Sydney hitting a server in us-east-1 get 200ms+ latency</p>
</li>
</ul>
<p>Let's fix all of this.</p>
<hr />
<h2>The Architecture I'd Actually Build</h2>
<p>Here's the high-level design:</p>
<pre><code class="language-plaintext">User
 │
 ▼
CloudFront (CDN + Edge Caching)
 │
 ▼
API Gateway
 │
 ├──► Lambda (URL Creation) ──► DynamoDB
 │
 └──► Lambda (URL Redirect) ──► ElastiCache (Redis)
                                      │
                                      └──► DynamoDB (cache miss)
</code></pre>
<p>Let me walk through each component and the reasoning behind it.</p>
<hr />
<h2>Component Breakdown</h2>
<h3>1. CloudFront — Your First Line of Defense</h3>
<p>CloudFront sits in front of everything. For a URL shortener, this is critical because:</p>
<ul>
<li><p>Redirects for popular URLs get <strong>cached at the edge</strong> — a link that goes viral serves millions of requests without ever hitting your origin</p>
</li>
<li><p>400+ edge locations globally mean low latency for everyone</p>
</li>
<li><p>Built-in DDoS protection via AWS Shield Standard</p>
</li>
</ul>
<p><strong>The cache key matters here.</strong> You want to cache on the short code, not the full URL. Set a short TTL (30-60 seconds) for most URLs, longer for ones you know won't change.</p>
<p>One important caveat: <strong>don't return 301 redirects</strong>. Browsers cache 301s permanently, which means if you ever need to update a destination URL, users with cached responses are stuck. Use <strong>302</strong> (temporary redirect) instead: CloudFront can still cache it at the edge for the TTL you set, and browsers won't treat it as permanent.</p>
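<p>One way to get "cached at the edge, not in the browser" is to set the cache lifetime for shared caches only. The sketch below builds a Lambda proxy-style response; the <code>build_redirect</code> helper and the 60-second default are assumptions for the example, and the effective edge TTL also depends on the minimum and maximum TTLs in your CloudFront cache policy.</p>
<pre><code class="language-python">def build_redirect(original_url: str, edge_ttl_seconds: int = 60) -&gt; dict:
    """302 response that shared caches may keep briefly and browsers should not reuse."""
    return {
        "statusCode": 302,
        "headers": {
            "Location": original_url,
            # s-maxage applies to shared caches such as a CDN;
            # max-age=0 tells browsers the response is immediately stale
            "Cache-Control": f"public, max-age=0, s-maxage={edge_ttl_seconds}",
        },
    }


response = build_redirect("https://example.com/some/long/path")
print(response["headers"]["Cache-Control"])  # public, max-age=0, s-maxage=60
</code></pre>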
<hr />
<h3>2. Short Code Generation — The Part Everyone Gets Wrong</h3>
<p>This is where most designs fall apart. Common approaches:</p>
<p><strong>Option A: Auto-increment ID + Base62 encoding</strong></p>
<p>Take a database sequence (1, 2, 3...), encode it in Base62 (0-9, a-z, A-Z), and get short codes like <code>b</code>, <code>c</code>, <code>1a</code>, <code>1b</code>.</p>
<p>Problems:</p>
<ul>
<li><p>Sequential codes are predictable — someone can enumerate all your URLs</p>
</li>
<li><p>Requires a centralised counter — creates a bottleneck</p>
</li>
</ul>
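<p>For reference, Base62 encoding of a counter is only a few lines. This sketch uses the alphabet order given above (0-9, a-z, A-Z); the function names are chosen for the example.</p>
<pre><code class="language-python">ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def base62_encode(n: int) -&gt; str:
    if n &lt; 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def base62_decode(code: str) -&gt; int:
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n


print([base62_encode(i) for i in (1, 11, 12, 72, 73)])  # ['1', 'b', 'c', '1a', '1b']
</code></pre>
<p>Consecutive IDs produce consecutive codes, which is exactly what makes this scheme easy to enumerate.</p>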
<p><strong>Option B: Random UUID + truncate</strong></p>
<p>Generate a UUID, take the first 7 characters. Simple. But collision probability increases as your dataset grows, and you need collision checks on every write.</p>
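<p>How quickly does the collision risk grow? A birthday-bound estimate gives a feel for it. The sketch below assumes the 7 characters are taken from the hex form of a random UUID, so the code space is 16^7; the function name is made up for the example.</p>
<pre><code class="language-python">import math


def collision_probability(num_codes: int, code_space: int) -&gt; float:
    """Birthday approximation: chance that at least two of num_codes random codes match."""
    return 1.0 - math.exp(-num_codes * (num_codes - 1) / (2 * code_space))


hex_space = 16 ** 7  # 7 hex characters: about 268 million possible codes

for n in (1_000, 20_000, 100_000):
    print(n, f"{collision_probability(n, hex_space):.3f}")
</code></pre>
<p>With around 20,000 URLs the odds of at least one collision are already roughly even, which is why every write needs a collision check under this scheme.</p>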
<p><strong>Option C: What I'd actually use — Snowflake-style ID + Base62</strong></p>
<p>Generate a time-ordered unique ID (similar to Twitter's Snowflake) and encode it. You get:</p>
<ul>
<li><p>Roughly time-ordered codes (good for debugging)</p>
</li>
<li><p>Extremely low collision probability without a central counter</p>
</li>
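<li><p>7-character codes that support ~3.5 trillion unique URLs (62^7), provided the ID fits in about 41 bits; a full 64-bit Snowflake ID would encode to as many as 11 Base62 characters</p>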
</li>
</ul>
<p>On AWS, you can implement this in Lambda with a combination of timestamp + random bits + worker ID (derived from the Lambda execution environment).</p>
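<p>As a rough sketch of the timestamp + worker ID + random bits idea, here is one possible layout that fits in 7 Base62 characters. Everything specific in it is an assumption for illustration: the custom epoch, the 30/4/7 bit split, and the helper names. With only 7 random bits per worker per second, collisions are still possible under load, so a conditional write that rejects an existing <code>short_code</code> remains necessary.</p>
<pre><code class="language-python">import secrets
import time

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
CUSTOM_EPOCH = 1_700_000_000  # hypothetical service epoch (Unix seconds)
TIME_BITS, WORKER_BITS, RANDOM_BITS = 30, 4, 7  # 41 bits total, below 62**7


def make_id(worker_id: int, now=None) -&gt; int:
    """Pack seconds-since-epoch, worker ID and random bits into one integer."""
    seconds = int((time.time() if now is None else now) - CUSTOM_EPOCH)
    if seconds not in range(2 ** TIME_BITS) or worker_id not in range(2 ** WORKER_BITS):
        raise ValueError("timestamp or worker_id does not fit its bit field")
    random_part = secrets.randbits(RANDOM_BITS)
    return (seconds &lt;&lt; (WORKER_BITS + RANDOM_BITS)) | (worker_id &lt;&lt; RANDOM_BITS) | random_part


def to_short_code(value: int) -&gt; str:
    """Encode as exactly 7 Base62 characters, left-padded with '0'."""
    chars = []
    for _ in range(7):
        value, rem = divmod(value, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


code = to_short_code(make_id(worker_id=3))
print(len(code))  # 7
</code></pre>
<p>The timestamp sits in the high bits, so IDs generated later are numerically larger, which is where the rough time ordering comes from.</p>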
<hr />
<h3>3. DynamoDB — The Right Database for This Problem</h3>
<p>Why DynamoDB over PostgreSQL or MySQL?</p>
<ul>
<li><p><strong>Access pattern is simple</strong>: you're almost always looking up by <code>short_code</code>. DynamoDB is optimised for exactly this.</p>
</li>
<li><p><strong>Scales horizontally</strong> without configuration — you don't manage sharding</p>
</li>
<li><p><strong>Single-digit millisecond latency</strong> at any scale</p>
</li>
<li><p><strong>On-demand capacity</strong> means you only pay for what you use — perfect for spiky traffic</p>
</li>
</ul>
<p><strong>Table design:</strong></p>
<pre><code class="language-plaintext">Table: urls
Partition Key: short_code (String)

Attributes:
- original_url (String)
- created_at (Number — Unix timestamp)
- expires_at (Number — TTL attribute, DynamoDB auto-deletes)
- created_by (String — user ID if authenticated)
- click_count (Number — updated asynchronously)
</code></pre>
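<p>Set <code>expires_at</code> as your TTL attribute and DynamoDB handles expiry automatically, with no cron jobs needed. TTL deletion runs in the background and can lag behind the expiry time, so the redirect path should still check <code>expires_at</code> before serving a link.</p>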
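<p>Because DynamoDB removes expired items in the background rather than at the exact expiry time, an expired item can still be returned by a read for a while. A small guard in the redirect path covers that gap. The <code>is_live</code> helper below is a hypothetical name, and it assumes the item has already been converted to a plain dict with <code>expires_at</code> as a Unix timestamp.</p>
<pre><code class="language-python">import time


def is_live(item: dict, now=None) -&gt; bool:
    """True if the item has no expiry, or its expires_at is still in the future."""
    now = time.time() if now is None else now
    expires_at = item.get("expires_at")
    return expires_at is None or float(expires_at) &gt; now


item = {"short_code": "xK92p", "original_url": "https://example.com", "expires_at": 1_800_000_000}

print(is_live(item, now=1_799_999_999))  # True
print(is_live(item, now=1_800_000_001))  # False
</code></pre>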
<hr />
<h3>4. ElastiCache (Redis) — Making Redirects Fast</h3>
<p>Even DynamoDB at ~5ms is too slow if you want sub-50ms redirects globally. The solution: cache the short_code → original_url mapping in Redis.</p>
<p><strong>Cache strategy:</strong> Cache-aside (lazy loading)</p>
<pre><code class="language-plaintext">1. Request comes in for /xK92p
2. Check Redis — cache hit → redirect immediately (~1ms)
3. Cache miss → query DynamoDB → cache result → redirect (~10ms)
</code></pre>
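<p>The three steps above can be sketched as a single function. To keep it runnable without AWS, the cache and table are passed in as objects; the <code>resolve</code> function, the <code>get</code>/<code>set</code>/<code>get_item</code> method names, and the dict-backed fakes are all stand-ins invented for this example, not the Redis or DynamoDB client APIs.</p>
<pre><code class="language-python">def resolve(short_code: str, cache, table, ttl_seconds: int = 86400):
    """Cache-aside lookup: try the cache, fall back to the table, then populate the cache."""
    url = cache.get(short_code)
    if url is not None:
        return url                       # cache hit
    url = table.get_item(short_code)     # cache miss: read the source of truth
    if url is not None:
        cache.set(short_code, url, ttl_seconds)
    return url


class DictCache:
    """In-memory stand-in for Redis (TTL is accepted but ignored)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value, ttl_seconds):
        self.data[key] = value


class DictTable:
    """In-memory stand-in for the urls table; counts reads."""

    def __init__(self, rows):
        self.rows = rows
        self.reads = 0

    def get_item(self, key):
        self.reads += 1
        return self.rows.get(key)


cache = DictCache()
table = DictTable({"xK92p": "https://example.com/a-very-long-url"})

resolve("xK92p", cache, table)   # miss: reads the table, fills the cache
resolve("xK92p", cache, table)   # hit: served from the cache
print(table.reads)  # 1
</code></pre>
<p>Unknown codes are not cached in this sketch, so repeated lookups for a missing code keep reaching the table; whether to cache misses is a separate design decision.</p>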
<p><strong>What to cache:</strong> Only redirects, not the creation flow. The creation flow is rare; redirects are constant.</p>
<p><strong>TTL:</strong> Set Redis TTL to match your use case. For general URLs, 24 hours is reasonable. For URLs you know are temporary, match the expiry.</p>
<p><strong>Cache invalidation:</strong> If a user deletes or updates a URL, invalidate the Redis key immediately. Don't wait for TTL expiry.</p>
<hr />
<h3>5. Lambda — Keeping Infrastructure Costs Low</h3>
<p>Two Lambda functions handle the core logic:</p>
<p><strong>URL Creation Lambda:</strong></p>
<ul>
<li><p>Validates the input URL (is it actually a URL? Is it safe?)</p>
</li>
<li><p>Generates the short code</p>
</li>
<li><p>Writes to DynamoDB</p>
</li>
<li><p>Returns the short URL</p>
</li>
</ul>
<p><strong>URL Redirect Lambda:</strong></p>
<ul>
<li><p>Checks Redis cache</p>
</li>
<li><p>Falls back to DynamoDB on cache miss</p>
</li>
<li><p>Returns a 302 redirect</p>
</li>
</ul>
<p>Why Lambda over EC2 or ECS? For this workload, Lambda is ideal:</p>
<ul>
<li><p>You pay per request, not per idle server</p>
</li>
<li><p>Auto-scales to millions of requests without configuration</p>
</li>
<li><p>No servers to manage or patch</p>
</li>
</ul>
<p>The one concern with Lambda is cold starts. For the redirect function specifically, cold starts add latency. Mitigate this with <strong>Provisioned Concurrency</strong> on the redirect Lambda — keep a pool of warm instances ready.</p>
<hr />
<h2>The Analytics Problem</h2>
<p>Most tutorials ignore click analytics entirely. But it's one of the most requested features.</p>
<p>The naive approach — incrementing a counter in DynamoDB on every redirect — is dangerous at scale. At 10,000 redirects/second on one URL, you'll hit DynamoDB write throughput limits and slow down redirects.</p>
<p><strong>Better approach:</strong> Decouple analytics from the redirect path.</p>
<pre><code class="language-plaintext">Redirect Lambda
 │
 └──► Kinesis Data Firehose (async, non-blocking)
              │
              ▼
           S3 (raw events)
              │
              ▼
        Athena (query analytics on demand)
</code></pre>
<p>The redirect happens immediately. The analytics event is fired asynchronously. The user never waits for analytics to complete.</p>
<p>For real-time dashboards, add a stream processor between Kinesis and a time-series store like DynamoDB or InfluxDB.</p>
<hr />
<h2>Handling Scale: The Numbers</h2>
<p>Let's validate this architecture against real numbers.</p>
<p><strong>Assumptions:</strong></p>
<ul>
<li><p>100 million redirects per day</p>
</li>
<li><p>1 million new URLs created per day</p>
</li>
<li><p>Average URL size: 200 bytes</p>
</li>
</ul>
<p><strong>Storage:</strong> 1M URLs/day × 200 bytes × 365 days = ~73GB/year. DynamoDB handles this trivially.</p>
<p><strong>Redirect throughput:</strong> 100M/day = ~1,160 requests/second average, with peaks 10x higher (~12,000 rps). With CloudFront caching popular URLs, most of this never hits your Lambda functions.</p>
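<p>The arithmetic behind the storage and throughput figures is short enough to check directly. The variable names are chosen for this example; the inputs are the assumptions listed above.</p>
<pre><code class="language-python">redirects_per_day = 100_000_000
new_urls_per_day = 1_000_000
bytes_per_url = 200

storage_gb_per_year = new_urls_per_day * bytes_per_url * 365 / 1e9
avg_rps = redirects_per_day / 86_400   # seconds in a day
peak_rps = avg_rps * 10

print(f"{storage_gb_per_year:.0f} GB/year")   # 73 GB/year
print(f"{avg_rps:,.0f} rps average")          # 1,157 rps average
print(f"{peak_rps:,.0f} rps at 10x peak")     # 11,574 rps at 10x peak
</code></pre>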
<p><strong>Cost estimate (rough):</strong></p>
<ul>
<li><p>CloudFront: ~$10-15/month at this scale</p>
</li>
<li><p>Lambda: ~$20-30/month</p>
</li>
<li><p>DynamoDB (on-demand): ~$50-100/month depending on cache hit rate</p>
</li>
<li><p>ElastiCache (small instance): ~$30/month</p>
</li>
</ul>
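<p><strong>Total: ~$110-175/month</strong> from the line items above, for a system handling 100 million redirects per day. Treat it as a floor rather than a quote: that volume is roughly 3 billion requests per month, API Gateway is not itemised at all, and per-request pricing on CloudFront alone may exceed the figure listed, so price it against current AWS rates before relying on the number. The low baseline is still the appeal of a serverless-first architecture.</p>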
<hr />
<h2>What I'd Skip in V1</h2>
<p>Not everything needs to be built on day one. Here's what I'd defer:</p>
<ul>
<li><p><strong>Custom domains</strong> (e.g. <code>company.com/abc</code>) — complex to implement, add later</p>
</li>
<li><p><strong>Link previews / Open Graph</strong> — nice to have, not essential</p>
</li>
<li><p><strong>Real-time analytics dashboard</strong> — build the data pipeline first, the UI later</p>
</li>
<li><p><strong>Multi-region active-active</strong> — unless you have users on multiple continents from day one, start single-region and expand</p>
</li>
</ul>
<p>The architecture above scales to hundreds of millions of requests before you need to rethink it. That buys you time to learn what your users actually need before over-engineering.</p>
<hr />
<h2>Key Takeaways</h2>
<ol>
<li><p><strong>Reads dominate writes</strong> — design your caching strategy around the redirect path, not the creation path</p>
</li>
<li><p><strong>Use 302, not 301</strong> — browser caching of 301s will haunt you</p>
</li>
<li><p><strong>Decouple analytics</strong> — never let analytics slow down the critical path</p>
</li>
<li><p><strong>Short code generation is harder than it looks</strong> — think carefully about collision probability and enumeration attacks</p>
</li>
<li><p><strong>CloudFront is doing more work than your servers</strong> — invest time in your cache invalidation strategy</p>
</li>
</ol>
<hr />
<h2>What Would You Change?</h2>
<p>This is one way to approach it — not the only way. I've made deliberate tradeoffs here that might not fit every use case.</p>
<p>What would you design differently? Would you choose a different database, a different caching strategy, or a different approach to short code generation?</p>
<p>Drop your thoughts in the comments — I read every one.</p>
<hr />
<p><em>If you found this useful, follow me on</em> <a href="https://www.linkedin.com/in/manbhasin"><em>LinkedIn</em></a> <em>where I post about system design, AWS, and engineering career growth every week.</em></p>
]]></content:encoded></item><item><title><![CDATA[Deploying Production-Ready Infrastructure on AWS in Minutes: A Developer’s Guide]]></title><description><![CDATA[The Challenge: Infrastructure Overhead Slows Down Delivery
For most engineers, shipping an application to production is more difficult than building it. The friction lies not in writing code but in setting up a reliable, secure, and maintainable infr...]]></description><link>https://blog.madhav.dev/deploying-production-ready-infrastructure-on-aws-in-minutes-a-developers-guide</link><guid isPermaLink="true">https://blog.madhav.dev/deploying-production-ready-infrastructure-on-aws-in-minutes-a-developers-guide</guid><category><![CDATA[AWS]]></category><category><![CDATA[Terraform]]></category><category><![CDATA[Infrastructure as code]]></category><category><![CDATA[Developer]]></category><category><![CDATA[startup]]></category><category><![CDATA[Time management]]></category><dc:creator><![CDATA[Madhav Bhasin]]></dc:creator><pubDate>Mon, 06 Oct 2025 05:27:10 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1759728285662/4e3dd279-101c-4475-8a4e-5a11fa032cef.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-the-challenge-infrastructure-overhead-slows-down-delivery">The Challenge: Infrastructure Overhead Slows Down Delivery</h2>
<p>For most engineers, shipping an application to production is more difficult than building it. The friction lies not in writing code but in setting up a reliable, secure, and maintainable infrastructure. Typical requirements include:</p>
<ul>
<li><p>Configuring VPC networking and security groups</p>
</li>
<li><p>Provisioning an RDS PostgreSQL instance with encryption and backups</p>
</li>
<li><p>Deploying containerized workloads on ECS Fargate with auto-scaling</p>
</li>
<li><p>Managing load balancers, SSL termination, and health checks</p>
</li>
<li><p>Setting up observability with monitoring and logging</p>
</li>
<li><p>Running database migrations in a controlled manner</p>
</li>
<li><p>Applying security best practices across all layers</p>
</li>
<li><p>Keeping infrastructure costs predictable and under control</p>
</li>
</ul>
<p>Without standardized practices, this process often extends into weeks of experimentation and patchwork fixes—time that could be better spent improving the application itself.</p>
<hr />
<h2 id="heading-the-approach-infrastructure-as-code-ready-for-production">The Approach: Infrastructure as Code, Ready for Production</h2>
<p>This solution encapsulates <strong>production-grade AWS infrastructure</strong> in a set of Terraform modules, designed to be deployed with a single command:</p>
<pre><code class="lang-bash">terraform apply -var-file=environments/stage/stage.tfvars
</code></pre>
<p>The result: a <strong>repeatable, secure, and auto-scaling deployment</strong> that can be provisioned in under 10 minutes.</p>
<hr />
<h2 id="heading-architecture-overview">Architecture Overview</h2>
<h3 id="heading-compute-layer">Compute Layer</h3>
<ul>
<li><p><strong>ECS Fargate</strong> for container workloads (serverless execution, no cluster management)</p>
</li>
<li><p><strong>Auto-scaling policies</strong> triggered by CPU utilization</p>
</li>
<li><p><strong>Zero-downtime deployments</strong> via rolling updates</p>
</li>
<li><p><strong>Multi-AZ support</strong> for resilience</p>
</li>
<li><p><strong>Configurable health checks</strong> for seamless load balancer integration</p>
</li>
</ul>
<h3 id="heading-database-layer">Database Layer</h3>
<ul>
<li><p><strong>PostgreSQL 16.3</strong> on Amazon RDS</p>
</li>
<li><p><strong>Encryption at rest</strong> with AWS KMS</p>
</li>
<li><p><strong>Automated daily backups</strong> with point-in-time recovery (7-day retention)</p>
</li>
<li><p><strong>Storage auto-scaling</strong> up to 100 GB</p>
</li>
<li><p><strong>IAM authentication</strong> for secure developer access</p>
</li>
<li><p><strong>Single-AZ (default)</strong> for cost efficiency, with Multi-AZ as an option</p>
</li>
</ul>
<h3 id="heading-networking-amp-security">Networking &amp; Security</h3>
<ul>
<li><p><strong>Custom VPC</strong> with public and private subnets across two Availability Zones</p>
</li>
<li><p><strong>Security groups</strong> applying least-privilege principles</p>
</li>
<li><p><strong>NAT Gateway</strong> giving private subnets outbound-only internet access</p>
</li>
<li><p><strong>Application Load Balancer</strong> with optional HTTPS support</p>
</li>
<li><p><strong>Private subnets</strong> isolating application and database resources</p>
</li>
</ul>
<h3 id="heading-devops-amp-monitoring">DevOps &amp; Monitoring</h3>
<ul>
<li><p><strong>ECR</strong> as a private container registry with vulnerability scanning</p>
</li>
<li><p><strong>CloudWatch</strong> for centralized metrics, logs, and alarms</p>
</li>
<li><p><strong>CI/CD integrations</strong> with GitHub Actions and GitLab CI/CD pipelines</p>
</li>
<li><p><strong>Database migrations</strong> run as ECS tasks, optionally using Spot capacity for cost savings</p>
</li>
<li><p><strong>Terraform workspaces</strong> supporting multiple environments (dev, staging, prod)</p>
</li>
</ul>
<hr />
<h2 id="heading-cost-analysis-us-east-1-reference">Cost Analysis (us-east-1 reference)</h2>
<p>This setup delivers production-grade reliability for <strong>~$85–$120/month</strong> depending on workload and scaling.</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Component</td><td>Monthly Cost</td><td>Notes</td></tr>
</thead>
<tbody>
<tr>
<td>RDS PostgreSQL</td><td>$25–35</td><td>db.t4g.small, encrypted, auto-scaling</td></tr>
<tr>
<td>ECS Fargate</td><td>$15–25</td><td>1–2 tasks, CPU/memory-based scaling</td></tr>
<tr>
<td>Application Load Balancer</td><td>$20–25</td><td>Production-ready load balancing</td></tr>
<tr>
<td>VPC &amp; Networking</td><td>$35–40</td><td>NAT Gateway + security groups</td></tr>
<tr>
<td>ECR &amp; CloudWatch</td><td>$5–10</td><td>Image registry + monitoring</td></tr>
<tr>
<td><strong>Total</strong></td><td><strong>$85–120</strong></td><td>Complete production setup</td></tr>
</tbody>
</table>
</div><p>Optimizations include single NAT Gateway usage, Spot ECS tasks for migrations, and right-sized defaults. Regional pricing variations apply (e.g., NAT Gateway costs in eu-west-1 are higher than in us-east-1).</p>
<hr />
<h2 id="heading-security-model">Security Model</h2>
<p>Security is applied consistently at multiple layers:</p>
<ul>
<li><p><strong>Network-level isolation</strong>: workloads and databases run in private subnets with no direct internet access.</p>
</li>
<li><p><strong>Access control</strong>: IAM roles scoped to minimal privileges, IAM-based DB authentication, image scanning in ECR.</p>
</li>
<li><p><strong>Data protection</strong>: RDS encryption at rest, TLS in transit, automated backups with deletion protection.</p>
</li>
</ul>
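<p>As a rough illustration of the IAM-based database authentication mentioned above, the sketch below requests a short-lived token and uses it as the password for <code>psql</code>. The function name, the <code>app_user</code> database user, and the <code>app</code> database name are placeholders rather than values from the repository, and the database user is assumed to have been granted the <code>rds_iam</code> role.</p>
<pre><code class="lang-bash"># Hypothetical helper: connect to RDS PostgreSQL using an IAM auth token.
# Usage: connect_with_iam_token RDS_ENDPOINT [REGION]
connect_with_iam_token() {
  local host="$1"
  local region="${2:-us-east-1}"
  local user="app_user"   # placeholder DB user that has been granted rds_iam
  local token

  # Tokens are valid for 15 minutes and are signed with the caller's IAM credentials.
  token="$(aws rds generate-db-auth-token \
    --hostname "$host" --port 5432 \
    --region "$region" --username "$user")"

  # IAM authentication requires an SSL connection.
  PGPASSWORD="$token" psql "host=$host port=5432 dbname=app user=$user sslmode=require"
}
</code></pre>
<p>Because the database sits in a private subnet, a helper like this has to run from somewhere with network access to it, such as a bastion host or a VPN-connected machine.</p>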
<hr />
<h2 id="heading-developer-experience">Developer Experience</h2>
<h3 id="heading-one-command-deployment">One-Command Deployment</h3>
<pre><code class="lang-bash">terraform apply -var-file=environments/prod/prod.tfvars
</code></pre>
<h3 id="heading-cicd-integration">CI/CD Integration</h3>
<ul>
<li><p>Example pipelines included for GitHub Actions and GitLab CI/CD</p>
</li>
<li><p>Infrastructure changes and application deployments handled in the same workflows</p>
</li>
<li><p>Automated DB migrations via ECS task definitions</p>
</li>
</ul>
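<p>A pipeline step that runs migrations as an ECS task could look roughly like the sketch below. The function name and every identifier passed to it (cluster, task definition, subnet, security group) are placeholders, and the example pipelines in the repository may take a different route.</p>
<pre><code class="lang-bash"># Hypothetical helper: run DB migrations as a one-off Fargate task and wait for it.
# Usage: run_migration_task CLUSTER TASK_DEFINITION SUBNET_ID SECURITY_GROUP_ID
run_migration_task() {
  local cluster="$1" task_def="$2" subnet_id="$3" sg_id="$4"
  local task_arn

  # FARGATE_SPOT is optional; the cluster must have that capacity provider enabled.
  task_arn="$(aws ecs run-task \
    --cluster "$cluster" \
    --task-definition "$task_def" \
    --capacity-provider-strategy capacityProvider=FARGATE_SPOT,weight=1 \
    --network-configuration "awsvpcConfiguration={subnets=[$subnet_id],securityGroups=[$sg_id],assignPublicIp=DISABLED}" \
    --query 'tasks[0].taskArn' --output text)"

  # Block until the task stops, then print the container's exit code.
  aws ecs wait tasks-stopped --cluster "$cluster" --tasks "$task_arn"
  aws ecs describe-tasks --cluster "$cluster" --tasks "$task_arn" \
    --query 'tasks[0].containers[0].exitCode' --output text
}
</code></pre>
<p>A pipeline can then fail the job whenever the printed exit code is anything other than <code>0</code>, so a broken migration stops the deployment before the new application version rolls out.</p>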
<h3 id="heading-multi-environment-support">Multi-Environment Support</h3>
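<pre><code class="lang-bash"># Create a workspace once per environment (this also switches to it)
terraform workspace new dev
# Later, switch to an existing workspace before applying its variables
terraform workspace select prod
terraform apply -var-file=environments/prod/prod.tfvars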
</code></pre>
<h3 id="heading-configurability">Configurability</h3>
<p>All parameters—VPC CIDR, subnets, ECS task sizes, database classes, retention policies—are exposed via <code>tfvars</code> with sensible, secure defaults.</p>
<hr />
<h2 id="heading-modularity-and-maintainability">Modularity and Maintainability</h2>
<p>Infrastructure is decomposed into reusable Terraform modules:</p>
<pre><code class="lang-plaintext">modules/
├── vpc/      # Networking
├── rds/      # Database
├── ecs/      # Compute
├── ecr/      # Container registry
├── alb/      # Load balancing
└── logging/  # Monitoring
</code></pre>
<p>This structure encourages reusability across projects, versioning through Git, and collaborative workflows with remote state.</p>
<hr />
<h2 id="heading-monitoring-amp-observability">Monitoring &amp; Observability</h2>
<ul>
<li><p><strong>CloudWatch metrics and alarms</strong> for ECS and RDS thresholds</p>
</li>
<li><p><strong>Centralized logging</strong> for ECS tasks and ALB</p>
</li>
<li><p><strong>Health checks</strong> for applications and databases</p>
</li>
<li><p><strong>Configurable retention</strong> to balance visibility with cost</p>
</li>
</ul>
<hr />
<h2 id="heading-practical-benefits">Practical Benefits</h2>
<ul>
<li><p><strong>For Individual Developers</strong>: Focus on building features instead of setting up networking and databases.</p>
</li>
<li><p><strong>For Teams</strong>: Consistent environments across dev/staging/prod, reducing onboarding time and configuration drift.</p>
</li>
<li><p><strong>For Startups</strong>: Production-quality infrastructure without dedicated DevOps resources, with predictable costs.</p>
</li>
</ul>
<hr />
<h2 id="heading-getting-started">Getting Started</h2>
<p><strong>Prerequisites</strong>: AWS CLI, Terraform (≥1.6.0), Docker, and an S3 bucket for remote state.</p>
<pre><code class="lang-bash">git <span class="hljs-built_in">clone</span> &lt;repo-url&gt;
<span class="hljs-built_in">cd</span> infra
terraform init -backend-config=environments/stage/backend.conf
terraform workspace new stage
terraform apply -var-file=environments/stage/stage.tfvars
</code></pre>
<p>Outputs include ALB DNS name, ECR repository URL, and RDS endpoint for application integration.</p>
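<p>One way to consume those outputs from a shell script is sketched below. The output names <code>ecr_repository_url</code> and <code>alb_dns_name</code>, the <code>/health</code> path, and the function name are assumptions made for illustration; running <code>terraform output</code> lists the names the modules actually define.</p>
<pre><code class="lang-bash"># Hypothetical helper: push a local image to the provisioned ECR repository,
# then probe the application through the load balancer.
push_image_and_check() {
  local region="${1:-us-east-1}"
  local repo_url alb_dns registry

  # Output names are assumed, not taken from the repository.
  repo_url="$(terraform output -raw ecr_repository_url)"
  alb_dns="$(terraform output -raw alb_dns_name)"
  registry="${repo_url%%/*}"   # strip the repository path, keep the registry host

  aws ecr get-login-password --region "$region" \
    | docker login --username AWS --password-stdin "$registry"

  docker build -t "$repo_url:latest" .
  docker push "$repo_url:latest"

  # Once the ECS service is running the new image, an assumed /health
  # endpoint can be checked over HTTP.
  curl --fail "http://$alb_dns/health"
}
</code></pre>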
<hr />
<h2 id="heading-extending-the-foundation">Extending the Foundation</h2>
<p>Common extensions include Redis/ElastiCache for caching, S3 for object storage, SES for email, and SQS/SNS for messaging. The design also supports scaling to multi-region deployments, CDN integration with CloudFront, and Kubernetes via EKS.</p>
<hr />
<h2 id="heading-repository">Repository</h2>
<p>You can find the full source code here: <a target="_blank" href="https://gitlab.com/aws-codeshare/backend-production-infra">GitLab</a> <a target="_blank" href="https://github.com/username/aws-infra">Repository</a></p>
<hr />
<h2 id="heading-conclusion">Conclusion</h2>
<p>Provisioning production-ready AWS infrastructure is no longer a multi-week effort. With this Terraform-based approach, developers can:</p>
<ul>
<li><p>Deploy in minutes with one command</p>
</li>
<li><p>Operate within a cost-efficient range (~$85–$120/month)</p>
</li>
<li><p>Inherit AWS security best practices out of the box</p>
</li>
<li><p>Scale seamlessly from development to production environments</p>
</li>
</ul>
<p>This setup is not just infrastructure—it’s a framework for <strong>developer velocity and operational reliability</strong>.</p>
]]></content:encoded></item><item><title><![CDATA[Small Commits, Big Wins: How Atomic Changes Transform Developer Life]]></title><description><![CDATA[Originally published on HackerNoon
It's 3 PM on a Wednesday. You're deep in the zone, crushing that new feature that's been on your backlog for weeks. Lines of code are flowing like poetry, your IDE is humming with activity, and you're making progres...]]></description><link>https://blog.madhav.dev/small-commits-big-wins-how-atomic-changes-transform-developer-life</link><guid isPermaLink="true">https://blog.madhav.dev/small-commits-big-wins-how-atomic-changes-transform-developer-life</guid><category><![CDATA[development]]></category><category><![CDATA[commit]]></category><category><![CDATA[Developer]]></category><category><![CDATA[wfh]]></category><dc:creator><![CDATA[Madhav Bhasin]]></dc:creator><pubDate>Fri, 29 Aug 2025 04:58:51 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1756443391224/ae644cd7-c2af-40f8-accc-ff58e65cc3b5.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Originally published on HackerNoon</em></p>
<p>It's 3 PM on a Wednesday. You're deep in the zone, crushing that new feature that's been on your backlog for weeks. Lines of code are flowing like poetry, your IDE is humming with activity, and you're making progress on multiple fronts simultaneously. You've refactored three components, fixed two bugs, updated the documentation, and added that shiny new feature your product manager has been asking for.</p>
<p>Then disaster strikes.</p>
<p>Your teammate Sarah messages you: "Hey, can you quickly revert that authentication bug fix from yesterday? It's causing issues in production."</p>
<p>You freeze. Yesterday's commit? That was part of your massive 47-file, 2,847-line commit that included the bug fix, but also the complete redesign of the user dashboard, database schema changes, and a refactor of the entire notification system.</p>
<p>Sound familiar? If you've been developing for more than a week, you've probably lived this nightmare.</p>
<h2 id="heading-the-monolithic-commit-monster">The Monolithic Commit Monster</h2>
<p>We've all been there. The commit message reads something like "Fixed stuff and added features" with a diff that spans across dozens of files. It's the developer equivalent of shoving everything into your bedroom closet before guests arrive – it might look clean from the outside, but good luck finding anything later.</p>
<p>Here's what a typical "monster commit" looks like:</p>
<pre><code class="lang-plaintext">commit a1b2c3d4e5f6789...
Author: John Developer &lt;john@example.com&gt;
Date: Tue Oct 15 18:45:22 2024 -0700

    Fixed bugs and improved UI

    Modified files:
    - src/components/UserDashboard.js (156 additions, 89 deletions)
    - src/components/Navigation.js (67 additions, 23 deletions)  
    - src/api/auth.js (45 additions, 12 deletions)
    - src/styles/main.css (234 additions, 156 deletions)
    - database/migrations/add_user_preferences.sql (23 additions, 0 deletions)
    - README.md (12 additions, 3 deletions)
    - package.json (3 additions, 1 deletion)
    ... and 23 more files
</code></pre>
<p>This commit is a debugging nightmare waiting to happen. What exactly did it fix? Which UI improvements were made? If something breaks, what do you revert? It's like trying to untangle Christmas lights in the dark.</p>
<h2 id="heading-enter-the-atomic-commit-philosophy">Enter the Atomic Commit Philosophy</h2>
<p>The atomic commit approach is beautifully simple: <strong>one logical change per commit</strong>. Each commit should represent a single, complete, and reversible change that makes sense on its own.</p>
<p>Instead of the monster above, imagine this sequence:</p>
<pre><code class="lang-plaintext">commit f1e2d3c4b5a6...
Author: John Developer &lt;john@example.com&gt;
Date: Tue Oct 15 14:30:22 2024 -0700

    Fix authentication timeout bug in login flow

    - Increase session timeout from 30 minutes to 2 hours
    - Add proper error handling for expired sessions
    - Update auth middleware to handle timeout gracefully

    Fixes: #1247

commit e1d2c3b4a5f6...
Author: John Developer &lt;john@example.com&gt;
Date: Tue Oct 15 15:15:33 2024 -0700

    Improve navigation menu accessibility

    - Add ARIA labels to all navigation items
    - Implement keyboard navigation support
    - Increase color contrast for better visibility

    Closes: #1156

commit d1c2b3a4f5e6...
Author: John Developer &lt;john@example.com&gt;
Date: Tue Oct 15 16:45:12 2024 -0700

    Add user preference persistence to database

    - Create user_preferences table migration
    - Add UserPreference model with validation
    - Implement CRUD operations for preferences

    Related: #1203
</code></pre>
<p>Now when Sarah asks you to revert that authentication fix, you can run <code>git revert</code> on that one commit and undo exactly what needs to be undone without affecting anything else.</p>
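<p>To make that concrete, here is a small self-contained sketch that builds a throwaway repository with three atomic commits and then undoes only the authentication one. The file names are invented for the demo, and each commit just adds an empty placeholder file.</p>
<pre><code class="lang-bash">set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Three unrelated changes, one commit each (placeholder files stand in for real work).
touch nav.txt
git add nav.txt
git commit -q -m "Improve navigation menu accessibility"

touch auth.txt
git add auth.txt
git commit -q -m "Fix authentication timeout bug in login flow"

touch prefs.txt
git add prefs.txt
git commit -q -m "Add user preference persistence to database"

# Find the auth fix by its message and undo just that commit.
auth_commit="$(git log --format=%H --grep="authentication timeout")"
git revert --no-edit "$auth_commit"

git log --oneline   # the original three commits plus the revert
ls                  # nav.txt and prefs.txt remain; auth.txt is gone
</code></pre>
<p>The revert is itself a new commit, so the history still records both the original fix and the decision to undo it.</p>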
<h2 id="heading-real-world-benefits-in-action">Real-World Benefits in Action</h2>
<p>Here's how atomic commits transform common development scenarios:</p>
<p><strong>Code Reviews</strong>: Instead of reviewing one 400-line mixed change, you review five focused commits with clear purposes. Reviews become faster and feedback more targeted.</p>
<p><strong>Debugging</strong>: When QA reports an avatar display bug, <code>git log --grep="avatar"</code> immediately shows the four avatar-related commits. You can pinpoint the issue in minutes instead of hours.</p>
<p><strong>Hotfix Deployment</strong>: Production has a payment bug, but the release also contains new features already announced to customers. With atomic commits, you revert just the problematic "Optimize payment validation" commit while keeping everything else.</p>
<p><strong>Feature Management</strong>: Your PM wants the new search filters in tomorrow's release, but not the redesigned results page. Clean commits let you cherry-pick exactly what's needed in minutes.</p>
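<p>The feature management scenario can be sketched the same way. In this throwaway repository, a hypothetical <code>search-work</code> branch holds two atomic commits, and only the search filters commit is picked onto <code>release</code>. Branch and file names are made up for the example.</p>
<pre><code class="lang-bash">set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

touch README.md
git add README.md
git commit -q -m "Initial commit"
git branch release            # the release branch starts from the initial commit

# Two independent features, each in its own commit on a feature branch.
git checkout -q -b search-work
touch search-filters.js
git add search-filters.js
git commit -q -m "Add search filters"
touch results-page.js
git add results-page.js
git commit -q -m "Redesign search results page"

# Ship only the filters: pick that single commit onto the release branch.
filters_commit="$(git log --format=%H --grep="Add search filters")"
git checkout -q release
git cherry-pick "$filters_commit"

ls   # README.md and search-filters.js; results-page.js stays on search-work
</code></pre>
<p>Had both features lived in one commit, the same operation would have required manually splitting the diff first.</p>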
<h2 id="heading-key-technical-advantages">Key Technical Advantages</h2>
<p><strong>Git Bisect</strong>: Find bugs faster by testing logical changes instead of mixed commits. When <code>git bisect</code> lands on the first bad commit, that commit contains a single logical change, so you know exactly what introduced the problem.</p>
<p><strong>Cleaner Conflicts</strong>: Atomic commits create focused merge conflicts that are easier to resolve intelligently.</p>
<p><strong>Flexible History</strong>: Easily reorder commits, split branches at logical boundaries, or create targeted backports with <code>git rebase -i</code>.</p>
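<p>Here is a minimal, self-contained sketch of the bisect advantage. It creates five small commits in a throwaway repository, plants a "bug" in the third (represented by a marker file, purely as a stand-in for a failing test), and lets <code>git bisect run</code> find it.</p>
<pre><code class="lang-bash">set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Five atomic commits; the third one introduces the "bug" (a marker file).
for n in 1 2 3 4 5; do
  touch "change-$n.txt"
  if [ "$n" = "3" ]; then touch bug.marker; fi
  git add .
  git commit -q -m "Change $n"
done

# Mark the newest commit bad and the first commit good, then let git drive.
first_commit="$(git rev-list --max-parents=0 HEAD)"
git bisect start HEAD "$first_commit"
# The test command exits 0 for a good commit and non-zero for a bad one.
git bisect run sh -c 'test ! -e bug.marker'

culprit="$(git log -1 --format=%s refs/bisect/bad)"
echo "First bad commit: $culprit"
git bisect reset
</code></pre>
<p>In a real project the <code>sh -c</code> command would be replaced by whatever test reproduces the bug. Because every commit is a single logical change, the commit that bisect reports is also the explanation.</p>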
<h2 id="heading-practical-implementation-strategies">Practical Implementation Strategies</h2>
<h3 id="heading-the-staging-area-is-your-friend">The Staging Area is Your Friend</h3>
<p>Use <code>git add -p</code> to stage changes interactively. This lets you commit related changes while leaving other modifications for separate commits:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># You've modified authentication.js and user-profile.js</span>
git add -p authentication.js  <span class="hljs-comment"># Stage only auth-related changes</span>
git commit -m <span class="hljs-string">"Fix session timeout handling"</span>

git add user-profile.js
git commit -m <span class="hljs-string">"Add profile picture upload validation"</span>
</code></pre>
<h3 id="heading-the-wip-commit-workflow">The WIP Commit Workflow</h3>
<p>Don't let the fear of "imperfect" commits stop you. Use Work In Progress (WIP) commits freely:</p>
<pre><code class="lang-bash">git commit -m <span class="hljs-string">"WIP: Start implementing user search"</span>
<span class="hljs-comment"># Continue coding</span>
git commit -m <span class="hljs-string">"WIP: Add search backend logic"</span>
<span class="hljs-comment"># More coding  </span>
git commit -m <span class="hljs-string">"WIP: Connect frontend to search API"</span>
</code></pre>
<p>Before pushing or opening a pull request, use <code>git rebase -i</code> to clean up your WIP commits into logical, atomic changes.</p>
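<p>Interactive rebase needs an editor, so the sketch below swaps in a different technique that can run unattended: a mixed <code>git reset</code> back to the starting point, followed by re-committing the changes in logical groups. The file names and the backend/UI split are invented for the demo, and this should only be done on commits that have not been pushed yet.</p>
<pre><code class="lang-bash">set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

touch README.md
git add README.md
git commit -q -m "Initial commit"
base="$(git rev-parse HEAD)"

# Three messy WIP commits that mix backend and frontend work.
touch search-api.js
git add .
git commit -q -m "WIP: Start implementing user search"
touch search-index.js search-box.js
git add .
git commit -q -m "WIP: Add search backend logic"
touch search-results.js
git add .
git commit -q -m "WIP: Connect frontend to search API"

# Drop the WIP commits but keep their changes in the working tree...
git reset -q "$base"

# ...then re-commit them as two logical changes.
git add search-api.js search-index.js
git commit -q -m "Add user search backend"
git add search-box.js search-results.js
git commit -q -m "Add user search UI"

git log --oneline   # the initial commit plus two clean commits
</code></pre>
<p>With <code>git rebase -i</code> you would reach a similar result by marking WIP commits as <code>squash</code> or <code>fixup</code> and reordering them in the todo list.</p>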
<h3 id="heading-commit-message-templates">Commit Message Templates</h3>
<p>Create a commit message template to maintain consistency:</p>
<pre><code class="lang-plaintext"># ~/.gitmessage
# [type]: [short description]
#
# [longer description if needed]
#
# Fixes: #[issue number]
# Closes: #[issue number]  
# Related: #[issue number]

# Types: feat, fix, docs, style, refactor, test, chore
</code></pre>
<p>Configure git to use it: <code>git config commit.template ~/.gitmessage</code></p>
<h2 id="heading-common-objections-and-why-theyre-wrong">Common Objections (And Why They're Wrong)</h2>
<h3 id="heading-it-takes-too-much-time">"It Takes Too Much Time"</h3>
<p><strong>Reality</strong>: You're going to spend time organizing your changes anyway – either upfront (with atomic commits) or later (when debugging, reviewing, or reverting). Atomic commits front-load a small amount of effort to save massive amounts of time later.</p>
<h3 id="heading-my-changes-are-too-interconnected">"My Changes Are Too Interconnected"</h3>
<p><strong>Reality</strong>: If your changes truly can't be separated, that might indicate a design problem. Most "interconnected" changes can be broken down:</p>
<ol>
<li><p>Add new functionality without using it</p>
</li>
<li><p>Refactor existing code to support the new functionality</p>
</li>
<li><p>Connect the new and existing functionality</p>
</li>
<li><p>Remove old/deprecated code</p>
</li>
</ol>
<h3 id="heading-code-reviews-will-have-too-many-commits">"Code Reviews Will Have Too Many Commits"</h3>
<p><strong>Reality</strong>: Reviewers prefer many small, focused commits over one large, mixed commit. It's easier to review five 50-line commits than one 250-line commit.</p>
<h2 id="heading-building-the-habit">Building the Habit</h2>
<h3 id="heading-start-small">Start Small</h3>
<p>Begin by committing more frequently, even if the commits aren't perfect. You can always clean them up later with <code>git rebase -i</code>.</p>
<h3 id="heading-use-a-timer">Use a Timer</h3>
<p>Set a timer for 25-30 minutes. When it goes off, assess if you have something worth committing. This creates a natural rhythm and prevents hours-long coding sessions without commits.</p>
<h3 id="heading-practice-the-squash-and-merge-workflow">Practice the Squash and Merge Workflow</h3>
<p>Many teams use "squash and merge" for pull requests, which can make developers lazy about commit hygiene. Fight this by treating your feature branch commits as a story for your reviewers, then squash only at merge time.</p>
<h3 id="heading-review-your-own-commits">Review Your Own Commits</h3>
<p>Before pushing, run <code>git log --oneline -10</code> and ask yourself:</p>
<ul>
<li><p>Can I understand what each commit does?</p>
</li>
<li><p>Would I be comfortable reverting any individual commit?</p>
</li>
<li><p>Do the commit messages tell a coherent story?</p>
</li>
</ul>
<h2 id="heading-the-long-term-impact">The Long-Term Impact</h2>
<p>After six months of practicing atomic commits, here's what developers typically report:</p>
<ul>
<li><p><strong>Debugging time reduced by 40-60%</strong>: Finding the source of bugs becomes systematic rather than exploratory</p>
</li>
<li><p><strong>Code review cycles shortened</strong>: Reviewers can provide more targeted, useful feedback</p>
</li>
<li><p><strong>Deployment confidence increased</strong>: Teams can deploy partial features or quickly revert problematic changes</p>
</li>
<li><p><strong>Team collaboration improved</strong>: Clear commit history serves as documentation of decision-making</p>
</li>
<li><p><strong>Technical debt reduced</strong>: Small, focused changes are less likely to introduce subtle bugs</p>
</li>
</ul>
<h2 id="heading-conclusion-your-future-self-will-thank-you">Conclusion: Your Future Self Will Thank You</h2>
<p>Every monolithic commit is a small act of cruelty toward your future self. You're essentially saying, "Figure it out later, buddy" to the developer who will need to debug, modify, or revert your changes.</p>
<p>Atomic commits are an act of kindness – to your teammates, to your future self, and to anyone who will maintain your code. They transform your git history from a chaotic timeline of mixed changes into a clear narrative of intentional decisions.</p>
<p>The next time you're about to commit 15 files with a message like "Various fixes and improvements," stop. Take five minutes to separate your changes into logical commits. Your 3 PM Wednesday self – the one dealing with production issues and impossible deadlines – will thank you.</p>
<p>Start tomorrow. Make your next commit atomic. Your development life will never be the same.</p>
<hr />
<p><em>What's your biggest atomic commit success story? Share it in the comments below, and let's help more developers discover the joy of clean git history.</em></p>
]]></content:encoded></item><item><title><![CDATA[From Hero Developer to Team Player: Breaking the Cowboy Coding Habit]]></title><description><![CDATA[Most of us have either worked with or been that developer — the one who jumps in at the last minute, writes hundreds of lines of code in a burst of energy, and “saves” the project.
It feels good — being the person everyone turns to when things get to...]]></description><link>https://blog.madhav.dev/from-hero-developer-to-team-player-breaking-the-cowboy-coding-habit</link><guid isPermaLink="true">https://blog.madhav.dev/from-hero-developer-to-team-player-breaking-the-cowboy-coding-habit</guid><category><![CDATA[team]]></category><category><![CDATA[Developer]]></category><category><![CDATA[developer relations]]></category><dc:creator><![CDATA[Madhav Bhasin]]></dc:creator><pubDate>Fri, 29 Aug 2025 04:54:19 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1756443076400/ebb18f47-ed4f-4c21-94ab-506c7cdf45b7.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Most of us have either worked with or been <strong>that developer</strong> — the one who jumps in at the last minute, writes hundreds of lines of code in a burst of energy, and “saves” the project.</p>
<p>It feels good — being the person everyone turns to when things get tough. It feels like impact.</p>
<p>But here’s what I’ve learned over the years: <strong>being the hero often does more harm than good</strong> — both to the team and to your own growth.</p>
<p>This post is about why that “lone wolf” style doesn’t scale, and what shifting away from it actually looks like in practice.</p>
<hr />
<h2 id="heading-why-the-lone-hero-model-breaks-down">Why the Lone Hero Model Breaks Down</h2>
<p>There’s no doubt that individual brilliance has a place in software development. But when everything revolves around one person:</p>
<h3 id="heading-knowledge-silos-form">Knowledge Silos Form</h3>
<p>Critical parts of the codebase become “owned” by one person. When they’re away, everything slows down — not because others aren’t capable, but because nobody else has the full picture.</p>
<h3 id="heading-technical-debt-sneaks-in">Technical Debt Sneaks In</h3>
<p>Working solo often means fewer reviews and less feedback. Temporary hacks stay in place. Clever shortcuts turn into long-term headaches.</p>
<h3 id="heading-career-growth-gets-stuck">Career Growth Gets Stuck</h3>
<p>It’s counterintuitive, but the more indispensable you make yourself, the harder it is to move forward. Senior and lead roles are about <strong>enabling others</strong>, not just being the one who can “do it all.”</p>
<hr />
<h2 id="heading-signs-you-might-be-stuck-in-cowboy-mode">Signs You Might Be Stuck in Cowboy Mode</h2>
<p>This isn’t always obvious — it hides behind urgency and the desire to move fast. Some red flags:</p>
<ul>
<li><p><strong>Thinking “faster alone” is always better</strong>:</p>
<blockquote>
<p>“It’ll take too long to explain.”<br />“I’ll just handle this now and loop them in later.”</p>
</blockquote>
</li>
<li><p><strong>Unhelpful commits</strong>:</p>
<pre><code class="lang-plaintext">  commit abc123...
  fix stuff
</code></pre>
</li>
<li><p><strong>Sparse documentation</strong>:<br />  READMEs that are outdated or missing. Functions like <code>doThing()</code> with no explanation why.</p>
</li>
<li><p><strong>Skipping discussions</strong>:<br />  Too busy coding to join design reviews or planning sessions.</p>
</li>
</ul>
<hr />
<h2 id="heading-what-shifting-away-from-this-looks-like">What Shifting Away from This Looks Like</h2>
<p>Moving away from “hero mode” doesn’t mean you stop coding. It means you <strong>stop being the single point of success or failure</strong>.</p>
<h3 id="heading-work-in-the-open">Work in the Open</h3>
<p>Instead of disappearing to fix something, share your process:</p>
<pre><code class="lang-markdown"><span class="hljs-section">## Debug Notes: User Sessions Timing Out</span>

<span class="hljs-strong">**What we saw**</span>: Users logged out after ~15 min  
<span class="hljs-strong">**What we checked**</span>:  
<span class="hljs-bullet">-</span> Session timeout (OK: 2h)  
<span class="hljs-bullet">-</span> Redis connections (fine)  
<span class="hljs-bullet">-</span> Load balancer (sticky sessions were off)  

<span class="hljs-strong">**Fix**</span>: Switched to Redis-backed sessions  
<span class="hljs-strong">**Next steps**</span>: Add monitoring on session distribution
</code></pre>
<p>Even this small act helps others understand what happened and why.</p>
<hr />
<h3 id="heading-involve-others-early">Involve Others Early</h3>
<p>Before jumping into a complex feature:</p>
<ol>
<li><p><strong>State the problem</strong>: “We need to support 10× more concurrent users.”</p>
</li>
<li><p><strong>Discuss options</strong>: vertical scaling, horizontal scaling, caching.</p>
</li>
<li><p><strong>Talk trade-offs</strong>: cost, timeline, complexity.</p>
</li>
<li><p><strong>Agree on a path</strong>: and note down why.</p>
</li>
</ol>
<p>This isn’t bureaucracy — it’s avoiding the costlier problem of building the wrong thing quickly.</p>
<hr />
<h3 id="heading-treat-code-reviews-as-learning-time">Treat Code Reviews as Learning Time</h3>
<p>Reviews aren’t just approvals. They’re a chance to share reasoning, context, and lessons.</p>
<p>For your PRs, explain:</p>
<pre><code class="lang-markdown"><span class="hljs-section">## Purpose</span>
Adds rate limiting for API endpoints.

<span class="hljs-section">## Why Redis?</span>
<span class="hljs-bullet">-</span> Works across servers  
<span class="hljs-bullet">-</span> Survives deploys  
<span class="hljs-bullet">-</span> Minimal latency (see attached benchmarks)

<span class="hljs-section">## Risks</span>
<span class="hljs-bullet">-</span> Redis memory usage
<span class="hljs-bullet">-</span> Potential false positives under high load
</code></pre>
<p>For others’ PRs, ask genuine questions. Share links. Suggest patterns, but don’t block unnecessarily.</p>
<hr />
<h3 id="heading-document-as-you-go">Document as You Go</h3>
<p>You don’t need perfect wikis — start with:</p>
<ul>
<li><p>A solid README</p>
</li>
<li><p>A quick architecture sketch</p>
</li>
<li><p>Common troubleshooting steps</p>
</li>
</ul>
<p>The key is to <strong>lower the barrier for the next person</strong>.</p>
<hr />
<h2 id="heading-a-realistic-way-to-start">A Realistic Way to Start</h2>
<p>Shifting this habit isn’t about an overnight transformation. Try this sequence over a few months:</p>
<ul>
<li><p><strong>Week 1</strong>: List systems where only you know how things work.</p>
</li>
<li><p><strong>Weeks 2–4</strong>: Pick one and write a clear setup guide, diagram, or walkthrough.</p>
</li>
<li><p><strong>Weeks 5–8</strong>: For your next big task, bring the team in early.</p>
</li>
<li><p><strong>Weeks 9–12</strong>: Do thoughtful reviews — not just approvals, but knowledge sharing.</p>
</li>
</ul>
<hr />
<h2 id="heading-the-impact-on-your-career">The Impact on Your Career</h2>
<p>Ironically, the moment you stop trying to be “indispensable,” you often become <strong>more valuable</strong>.</p>
<p>Teams appreciate developers who:</p>
<ul>
<li><p>Make others faster</p>
</li>
<li><p>Leave trails others can follow</p>
</li>
<li><p>Build systems that live beyond them</p>
</li>
</ul>
<p>That’s the kind of work that leads to senior roles, tech lead positions, and the ability to influence across multiple projects — not just the one you’re currently “saving.”</p>
<hr />
<h2 id="heading-final-thought">Final Thought</h2>
<p>Being a great developer isn’t just about the code you write. It’s about the <strong>systems you leave behind and the people you help grow</strong>.</p>
<p>Start small: write that README, share that debugging process, ask one more question in your next review. It adds up — for your team and for your career.</p>
<hr />
<p><em>Have you ever had to break the “hero habit”? What was the turning point for you?</em></p>
]]></content:encoded></item><item><title><![CDATA[AWS S3 as an Image Hosting Service]]></title><description><![CDATA[In today's digital world, images are an essential part of online communication. From personal blogs to e-commerce websites, images are used to convey messages, showcase products, and enhance the overall user experience. However, hosting and managing ...]]></description><link>https://blog.madhav.dev/aws-s3-as-an-image-hosting-service</link><guid isPermaLink="true">https://blog.madhav.dev/aws-s3-as-an-image-hosting-service</guid><category><![CDATA[AWS]]></category><category><![CDATA[Amazon S3]]></category><category><![CDATA[#IaC]]></category><category><![CDATA[Devops]]></category><category><![CDATA[hosting]]></category><dc:creator><![CDATA[Madhav Bhasin]]></dc:creator><pubDate>Thu, 16 Feb 2023 20:14:06 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1660057307390/xIcuEcnEV.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In today's digital world, images are an essential part of online communication. From personal blogs to e-commerce websites, images are used to convey messages, showcase products, and enhance the overall user experience. However, hosting and managing images can be a daunting task, especially for websites with a large number of visitors. This is where Amazon S3 comes into play as an excellent image hosting service.</p>
<p>Amazon S3, or Simple Storage Service, is a cloud-based object storage service that allows users to store and retrieve any amount of data from anywhere on the internet. With its high scalability, durability, and security, S3 has become the go-to solution for image hosting, serving as a reliable and cost-effective alternative to traditional hosting services.</p>
<p>Here are some of the benefits of using AWS S3 as an image hosting service:</p>
<ol>
<li><p>Cost-effective: S3 pricing is based on usage, making it a cost-effective solution for image hosting. With S3's pay-as-you-go pricing model, you only pay for the storage and bandwidth you use, with no upfront fees or long-term commitments. This makes S3 an ideal option for websites with varying traffic patterns or those looking to reduce their hosting costs.</p>
</li>
<li><p>High Scalability: S3 allows you to store an unlimited number of images and scales effortlessly to handle any increase in traffic. As your website grows and attracts more visitors, S3 can handle the increased traffic without any downtime or interruptions, making it a scalable solution for image hosting.</p>
</li>
<li><p>Durability: S3 Standard is designed for 99.999999999% (11 nines) durability and 99.99% availability of objects over a given year. In practice, this means your images are extremely unlikely to be lost and stay highly available through hardware failures, although brief periods of unavailability are still possible.</p>
</li>
<li><p>Security: S3 offers multiple security features to protect your images from unauthorized access, including access control lists, bucket policies, and encryption options. Additionally, S3 integrates with AWS Identity and Access Management (IAM) to provide granular access controls and identity management.</p>
</li>
<li><p>Easy Integration: S3 integrates seamlessly with other AWS services, such as CloudFront, Lambda, and API Gateway, to offer a complete solution for image hosting. This integration allows you to build a scalable and secure image hosting service that can handle any traffic load.</p>
</li>
</ol>
<h3 id="heading-lets-dive-into-some-code">Let's dive into some code</h3>
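<p>Here's a Python example that turns off the Block Public Access settings on an S3 bucket using the Boto3 library. This does not make anything public by itself; it only allows a public bucket policy or ACL to take effect:</p>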
<pre><code class="lang-python"><span class="hljs-keyword">import</span> boto3

<span class="hljs-comment"># Uses the default credential chain (environment variables, ~/.aws, or an IAM role)</span>
s3 = boto3.client(<span class="hljs-string">'s3'</span>)

bucket_name = <span class="hljs-string">'your-bucket-name'</span>

<span class="hljs-comment"># Setting all four flags to False turns Block Public Access off for this bucket</span>
response = s3.put_public_access_block(
    Bucket=bucket_name,
    PublicAccessBlockConfiguration={
        <span class="hljs-string">'BlockPublicAcls'</span>: <span class="hljs-literal">False</span>,
        <span class="hljs-string">'IgnorePublicAcls'</span>: <span class="hljs-literal">False</span>,
        <span class="hljs-string">'BlockPublicPolicy'</span>: <span class="hljs-literal">False</span>,
        <span class="hljs-string">'RestrictPublicBuckets'</span>: <span class="hljs-literal">False</span>
    }
)
</code></pre>
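<p>Turning off Block Public Access only removes the guard rail; objects still need an explicit grant before anonymous visitors can read them. A minimal sketch of one common way to do that, a public-read bucket policy, is below. The helper names <code>build_public_read_policy</code> and <code>apply_public_read_policy</code> are made up for illustration and are not taken from the linked template; applying the policy assumes Boto3 is installed and AWS credentials are configured.</p>

```python
import json


def build_public_read_policy(bucket_name: str) -> dict:
    """Return a bucket policy that lets anyone read (GET) objects in the bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                # The trailing /* scopes the grant to objects, not the bucket itself.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


def apply_public_read_policy(bucket_name: str) -> None:
    """Attach the policy to the bucket. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the policy builder works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_policy(
        Bucket=bucket_name,
        Policy=json.dumps(build_public_read_policy(bucket_name)),
    )


if __name__ == "__main__":
    # 'your-bucket-name' is a placeholder, as in the snippet above.
    print(json.dumps(build_public_read_policy("your-bucket-name"), indent=2))
```

<p>Running the file only prints the policy document, so you can review it before calling <code>apply_public_read_policy</code> against a real bucket.</p>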
<p>For a website that hosts images, this is just one of the steps in setting up the whole flow. As a developer, I have also had to set up that whole flow by hand.</p>
<p>This is where <strong>automation</strong> (IaC) comes in. Infrastructure as code (IaC) refers to the process of managing and provisioning infrastructure through code and automation tools. AWS CloudFormation is an IaC service that enables users to create and manage AWS resources using templates and automation.</p>
<p><a target="_blank" href="https://gitlab.com/madhav.bhasin/aws-resources/-/blob/main/cloud-formation-templates/s3-public.yaml">Here</a> is a CloudFormation template that sets up the whole flow for us. The template does the following:</p>
<ol>
<li><p>Creates an S3 bucket with public access</p>
</li>
<li><p>Attaches a CloudFront CDN to the new S3 bucket for faster access and caching</p>
</li>
<li><p>Creates an IAM user with full access to the newly created S3 bucket</p>
</li>
</ol>
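<p>This template removes the burden of setting up the whole flow manually. Run the following command from your terminal, or upload the template in the AWS console. The <code>--capabilities CAPABILITY_NAMED_IAM</code> flag acknowledges that the template creates IAM resources:</p>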
<pre><code class="lang-bash">aws cloudformation deploy --stack-name [STACK_NAME] --parameter-overrides Stage=<span class="hljs-string">"[ENVIRONMENT]"</span> BucketName=<span class="hljs-string">"[BUCKET_NAME]"</span> --template-file [PATH_TO_TEMPLATE] --region [REGION] --profile [PROFILE_NAME] --capabilities CAPABILITY_NAMED_IAM
</code></pre>
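<p>If you deploy the same template for several environments, it can help to build the command in a script instead of retyping it. The sketch below assembles the same <code>aws cloudformation deploy</code> arguments shown above; the function names and the example values (stack name, region, profile) are placeholders of mine, and it assumes the AWS CLI is installed and on your <code>PATH</code>.</p>

```python
import shlex
import subprocess


def build_deploy_command(stack_name, stage, bucket_name, template_file, region, profile):
    """Build the argument list for `aws cloudformation deploy`."""
    return [
        "aws", "cloudformation", "deploy",
        "--stack-name", stack_name,
        # Stage and BucketName are the template parameters used in the command above.
        "--parameter-overrides", f"Stage={stage}", f"BucketName={bucket_name}",
        "--template-file", template_file,
        "--region", region,
        "--profile", profile,
        "--capabilities", "CAPABILITY_NAMED_IAM",
    ]


def deploy(**kwargs):
    """Run the deploy; raises CalledProcessError if the AWS CLI exits non-zero."""
    subprocess.run(build_deploy_command(**kwargs), check=True)


if __name__ == "__main__":
    # All values below are placeholders; replace them with your own.
    cmd = build_deploy_command(
        stack_name="image-hosting-dev",
        stage="dev",
        bucket_name="your-bucket-name",
        template_file="s3-public.yaml",
        region="us-east-1",
        profile="default",
    )
    print(shlex.join(cmd))  # print the command instead of running it
```

<p>Passing the arguments as a list avoids shell quoting problems; call <code>deploy(...)</code> with the same keyword arguments once the printed command looks right.</p>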
<p>In conclusion, AWS S3 is a reliable and cost-effective image hosting solution that offers a range of benefits, including high scalability, durability, security, and easy integration with other AWS services. If you're looking for an image hosting service that can handle your website's traffic and provide a seamless user experience, AWS S3 is a strong choice.</p>
<p><strong>NOTE: I am a developer too. The template may have its faults. Feel free to comment or contact me directly through my website or LinkedIn.</strong></p>
<p>Hope this was helpful. Thanks</p>
<p>Website: <a target="_blank" href="https://madhav.dev">Madhav Bhasin</a></p>
<p>Github: <a target="_blank" href="https://github.com/manbhasin">Madhav Bhasin</a></p>
<p>Linkedin: <a target="_blank" href="https://www.linkedin.com/in/manbhasin/">Madhav Bhasin</a></p>
]]></content:encoded></item><item><title><![CDATA[Managing Multiple SSH Keys]]></title><description><![CDATA[SSH is one of the most used protocols for safe data exchange. SSH keys can serve as a means of identifying yourself to an SSH server using public-key cryptography and challenge-response authentication. 

Working with Single SSH Keypair

It's quite ea...]]></description><link>https://blog.madhav.dev/managing-multiple-ssh-keys</link><guid isPermaLink="true">https://blog.madhav.dev/managing-multiple-ssh-keys</guid><category><![CDATA[Developer]]></category><category><![CDATA[ssh]]></category><dc:creator><![CDATA[Madhav Bhasin]]></dc:creator><pubDate>Fri, 11 Mar 2022 12:39:55 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/unsplash/7tkDoo2L_Eg/upload/v1646970100901/ias5TXTdw.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>SSH is one of the most used protocols for safe data exchange. SSH keys can serve as a means of identifying yourself to an SSH server using <a target="_blank" href="https://en.wikipedia.org/wiki/Public-key_cryptography">public-key cryptography</a> and <a target="_blank" href="https://en.wikipedia.org/wiki/Challenge%E2%80%93response_authentication">challenge-response authentication</a>. </strong></p>
<ul>
<li><strong>Working with a Single SSH Key Pair</strong></li>
</ul>
<p>Working with a single SSH key pair is straightforward:</p>
<ol>
<li><p><strong>Create a new key pair</strong></p>
<pre><code> ssh<span class="hljs-operator">-</span>keygen <span class="hljs-operator">-</span>t ed25519 <span class="hljs-operator">-</span>C <span class="hljs-string">"your_email@example.com"</span>
 # or, for older systems that don't support ED25519
 ssh<span class="hljs-operator">-</span>keygen <span class="hljs-operator">-</span>t rsa <span class="hljs-operator">-</span>b <span class="hljs-number">4096</span> <span class="hljs-operator">-</span>C <span class="hljs-string">"your_email@example.com"</span>

 # NOTE: Replace <span class="hljs-string">"your_email@example.com"</span> with your own email
</code></pre><p> Wondering what ED25519 and RSA are? Have a look at <a target="_blank" href="https://security.stackexchange.com/questions/90077/ssh-key-ed25519-vs-rsa">SSH Key ED25519 vs RSA</a>.</p>
<p> <strong>Use ED25519 whenever the server supports it: it's faster and generally considered more secure.</strong></p>
<p> After running one of the above commands you will be prompted for a file name; you can type one or accept the default. My personal preference is to give each key its own file name rather than use the default. <strong>NOTE: If you choose a new file name, pass the full path and not just the file name, e.g. <code>/Users/&lt;USERNAME&gt;/.ssh/my-new-ssh-key</code>. A bare file name is saved in your current directory instead of <code>~/.ssh</code>.</strong></p>
<p>Hit Enter, set a passphrase for your key, and <strong>TADA: It's done</strong></p>
</li>
<li><p><strong>Copy your public key and paste it on the server/repo</strong></p>
<pre><code> # on macOS (copies the key to the clipboard)
 pbcopy <span class="hljs-operator">&lt;</span> <span class="hljs-operator">~</span><span class="hljs-operator">/</span>.ssh/[SSH_KEY_NAME].pub

 # on Linux (prints the key so you can copy it from the terminal)
 cat <span class="hljs-operator">~</span><span class="hljs-operator">/</span>.ssh/[SSH_KEY_NAME].pub
</code></pre></li>
<li><p><strong>Test your SSH connection</strong>
 e.g. if you generated the SSH key pair for your GitHub account, run:</p>
<pre><code>  ssh -T git@github.com
</code></pre><p>  GitHub only accepts SSH logins as the <code>git</code> user, so the <code>git@</code> prefix is required. If you saved the key under a non-default file name, also pass it with <code>-i ~/.ssh/[SSH_KEY_NAME]</code>. If everything is good, you will see something like this:</p>
<pre><code> Hi <span class="hljs-tag">&lt;<span class="hljs-name">USERNAME</span>&gt;</span>! You've successfully authenticated, but GitHub does not provide 
 shell access.
</code></pre></li>
</ol>
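<p>The key-creation step can also be scripted. The sketch below only builds the <code>ssh-keygen</code> argument list, using the same <code>-t</code>, <code>-b</code> and <code>-C</code> flags as above plus <code>-f</code> to set the full output path up front. The function name <code>build_keygen_command</code> and the <code>ssh_dir</code> parameter are illustrative assumptions, not part of any tool.</p>

```python
from pathlib import Path


def build_keygen_command(key_name, email, key_type="ed25519", ssh_dir=None):
    """Return the ssh-keygen argument list for a named key inside ~/.ssh."""
    ssh_dir = Path(ssh_dir) if ssh_dir else Path.home() / ".ssh"
    cmd = ["ssh-keygen", "-t", key_type]
    if key_type == "rsa":
        cmd += ["-b", "4096"]  # same key size as the RSA command above
    # -f takes the full path, so the key does not land in the current directory
    cmd += ["-C", email, "-f", str(ssh_dir / key_name)]
    return cmd


if __name__ == "__main__":
    # Placeholder values; ssh-keygen itself is not run here.
    print(build_keygen_command("my-new-ssh-key", "your_email@example.com"))
```

<p>Passing the returned list to <code>subprocess.run(cmd, check=True)</code> would run it; <code>ssh-keygen</code> still prompts for the passphrase interactively.</p>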
<ul>
<li><p><strong>Working with Multiple SSH Key Pairs</strong></p>
<p> Managing SSH keys can become cumbersome as soon as you need to use a second 
 key pair. You might be using one SSH key pair for working on your company’s internal 
 projects but you might be using a different key for accessing some corporate client’s 
 servers. We can have more such cases where we need to have multiple SSH key 
 pairs.</p>
</li>
</ul>
<ol>
<li><p><strong>Create another SSH key pair, following the same steps as above</strong></p>
<p> When you test your connection or run a Git command against the second account, you will see a failure along these lines (the exact message varies):</p>
<pre><code> <span class="hljs-keyword">connect</span> <span class="hljs-keyword">to</span> &lt;<span class="hljs-keyword">server</span>&gt; host : <span class="hljs-keyword">Connection</span> Refused
</code></pre><p> So what happened here? Take GitHub as an example. You have two 
 GitHub accounts with a different SSH public key attached to each (GitHub 
 doesn't allow the same SSH key on two different accounts).  </p>
<p> Your second GitHub account has its own public key and expects the matching private 
 key from your local machine. But SSH doesn't offer that key; it offers the default one, because
 both accounts use the same hostname <code>github.com</code> and SSH has no way to tell them apart. This is where the 
 <strong>SSH Config</strong> comes in.</p>
</li>
<li><p><strong>SSH Config</strong></p>
<p>  SSH allows you to set up a per-user configuration file where you can store different 
  SSH options for each remote machine you connect to.
  By default, the SSH configuration file may not exist, so you may need to create it </p>
<pre><code>  touch <span class="hljs-operator">~</span><span class="hljs-operator">/</span>.ssh/config
</code></pre><p>  This file must be readable and writable only by the user and not accessible by 
  others </p>
<pre><code>  chmod <span class="hljs-number">600</span> <span class="hljs-operator">~</span><span class="hljs-operator">/</span>.ssh/config
</code></pre><p>   SSH Config File Example</p>
<pre><code> Host github.com-targaryen
      HostName github.com
      User git
      IdentityFile <span class="hljs-operator">~</span><span class="hljs-operator">/</span>.ssh/targaryen
</code></pre><p> Here is what's going on : </p>
<ol>
<li>We have defined a <code>Host</code> entry: an alias for the server that the rules below apply to</li>
<li><p>Under that host we have defined the rules: the real hostname (<code>HostName</code>), the server user (<code>User</code>), and an 
 <code>IdentityFile</code> (the private key file)</p>
<p>When you connect as the user (git) to the host alias (github.com-targaryen), the SSH client 
connects to github.com and offers the specified IdentityFile instead of the default one.</p>
<p><strong>Solution for our current use case</strong></p>
<pre><code>Host github.com-githubAccount1
     HostName github.com
     User git
     IdentityFile <span class="hljs-operator">~</span><span class="hljs-operator">/</span>.ssh/<span class="hljs-operator">&lt;</span>FILE_NAME_1<span class="hljs-operator">&gt;</span>

Host github.com-githubAccount2  
     HostName github.com
     User git
     IdentityFile <span class="hljs-operator">~</span><span class="hljs-operator">/</span>.ssh/<span class="hljs-operator">&lt;</span>FILE_NAME_2<span class="hljs-operator">&gt;</span>
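</code></pre><p> NOTE: Here the FILE_NAME should be the respective private key file name. Also, 
 the <code>Host</code> alias <code>github.com-githubAccount1</code> can be any name you like; keeping the 
 <code>github.com-</code> prefix is just a convention that makes it obvious which service the alias points to. What matters is that <code>HostName</code> stays <code>github.com</code>.</p>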
</li>
</ol>
</li>
</ol>
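<p>With more than a couple of accounts, writing these stanzas by hand gets repetitive. Here is a rough sketch that renders the same <code>Host</code> blocks from a dictionary. The function names and the key file names (<code>key_account1</code>, <code>key_account2</code>) are hypothetical; it only returns text and leaves it to you to paste or append the result into <code>~/.ssh/config</code>.</p>

```python
def render_host_block(alias, identity_file, hostname="github.com", user="git"):
    """Render one Host stanza in the same shape as the examples above."""
    lines = [
        f"Host {alias}",
        f"     HostName {hostname}",
        f"     User {user}",
        f"     IdentityFile {identity_file}",
    ]
    return "\n".join(lines)


def render_ssh_config(accounts):
    """accounts maps a Host alias to the path of its private key file."""
    blocks = [render_host_block(alias, key) for alias, key in accounts.items()]
    # Blank line between stanzas, trailing newline at the end of the file.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(render_ssh_config({
        "github.com-githubAccount1": "~/.ssh/key_account1",
        "github.com-githubAccount2": "~/.ssh/key_account2",
    }))
```

<p>Each alias maps to exactly one private key, which is the whole point of the config: the alias you connect to decides which key SSH offers.</p>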
<p><strong>IMPORTANT</strong></p>
<p>   When cloning a repository or adding a remote, make sure you do this step:</p>
<p>   Change the SSH clone URL slightly so that it uses your host alias:</p>
<pre><code>    Original
    git@github.com:<span class="hljs-operator">&lt;</span>user<span class="hljs-operator">&gt;</span><span class="hljs-operator">/</span><span class="hljs-operator">&lt;</span>repo<span class="hljs-operator">&gt;</span>.git

    Changed
    git@github.com-githubAccount1:<span class="hljs-operator">&lt;</span>user<span class="hljs-operator">&gt;</span><span class="hljs-operator">/</span><span class="hljs-operator">&lt;</span>repo<span class="hljs-operator">&gt;</span>.git
</code></pre><p><strong> Noticed what changed? I have added a unique identifier after <code>github.com</code>. The 
 host part of the URL must exactly match the <code>Host</code> alias you defined in the SSH config file; 
 otherwise SSH falls back to the default key. </strong></p>
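<p>The URL change is mechanical, so it is easy to script. A small sketch, assuming the standard <code>git@github.com:&lt;user&gt;/&lt;repo&gt;.git</code> form shown above; the function name <code>use_host_alias</code> is my own.</p>

```python
def use_host_alias(clone_url, alias, host="github.com"):
    """Swap the host in an SSH clone URL for an SSH config Host alias."""
    prefix = f"git@{host}:"
    if not clone_url.startswith(prefix):
        raise ValueError(f"expected an SSH clone URL starting with {prefix!r}")
    repo_path = clone_url[len(prefix):]  # the "user/repo.git" part
    return f"git@{alias}:{repo_path}"


if __name__ == "__main__":
    # Placeholder repository path; the alias matches the config example above.
    print(use_host_alias("git@github.com:user/repo.git", "github.com-githubAccount1"))
```

<p>For a repository you have already cloned, pass the result to <code>git remote set-url origin</code> to switch it over to the alias.</p>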
<p>Hope this was helpful. Thanks</p>
<p>Website : <a target="_blank" href="https://madhav.dev">Madhav Bhasin</a></p>
<p>Github : <a target="_blank" href="https://github.com/manbhasin">Madhav Bhasin</a></p>
<p>Linkedin : <a target="_blank" href="https://www.linkedin.com/in/manbhasin/">Madhav Bhasin</a></p>
<p>NOTE: I would appreciate any comments if I have missed anything.</p>
]]></content:encoded></item></channel></rss>