NETWORKING

WebSockets in practice: from handshake to production deploy

Complete guide to WebSockets: how they work, implementation with Socket.IO and ws, scaling with Redis, and when to actually use them.

By Thiago Saraiva · 38 min


Fundamentals

What are WebSockets?

WebSockets (RFC 6455) enable real-time communication with low latency. Unlike traditional HTTP (request/response), WebSockets maintain an open connection where both sides can send data at any time.

Technology Comparison

<table>
<tr> <td>Feature</td> <td>WebSocket</td> <td>HTTP/REST</td> <td>SSE</td> <td>Long Polling</td> </tr>
<tr> <td>Direction</td> <td>Bidirectional</td> <td>Unidirectional</td> <td>Server→Client</td> <td>Server→Client (client-initiated)</td> </tr>
<tr> <td>Latency</td> <td>< 1ms</td> <td>50-200ms</td> <td>~5ms</td> <td>100-500ms</td> </tr>
<tr> <td>Overhead</td> <td>2-14 bytes/frame</td> <td>~800 bytes</td> <td>~50 bytes</td> <td>~800 bytes</td> </tr>
<tr> <td>Protocol</td> <td>ws:// wss://</td> <td>http:// https://</td> <td>http:// https://</td> <td>http:// https://</td> </tr>
<tr> <td>Browser Support</td> <td>98%+</td> <td>100%</td> <td>95%+</td> <td>100%</td> </tr>
<tr> <td>Reconnection</td> <td>Manual</td> <td>N/A</td> <td>Automatic</td> <td>Manual</td> </tr>
</table>

## Handshake and Upgrade

A WebSocket connection starts as a normal HTTP request carrying an `Upgrade: websocket` header and is then upgraded to the WebSocket protocol:

Server response:
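A server library normally handles this exchange, but the key step is small enough to sketch. Per RFC 6455, the server proves it understood the upgrade by hashing the client's `Sec-WebSocket-Key` together with a fixed GUID and returning the result in `Sec-WebSocket-Accept`. The helper names `computeAccept` and `buildUpgradeResponse` below are illustrative and not part of any library:

```javascript
const crypto = require('node:crypto');

// Fixed GUID defined by RFC 6455.
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Derive the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key.
function computeAccept(secWebSocketKey) {
  return crypto
    .createHash('sha1')
    .update(secWebSocketKey + WS_GUID)
    .digest('base64');
}

// Build the raw 101 response a server writes back to complete the upgrade.
function buildUpgradeResponse(secWebSocketKey) {
  return [
    'HTTP/1.1 101 Switching Protocols',
    'Upgrade: websocket',
    'Connection: Upgrade',
    `Sec-WebSocket-Accept: ${computeAccept(secWebSocketKey)}`,
    '',
    '',
  ].join('\r\n');
}

// Sample key taken from RFC 6455.
console.log(buildUpgradeResponse('dGhlIHNhbXBsZSBub25jZQ=='));
```

After this response is written, both sides stop speaking HTTP and exchange WebSocket frames over the same TCP connection.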

PostgreSQL LISTEN/NOTIFY

PostgreSQL offers native pub/sub that can trigger WebSocket events:

Node.js integration:
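One possible shape for this integration, sketched with its dependencies passed in. It assumes `pgClient` is an already-connected node-postgres `Client` (which emits `notification` events once `LISTEN` has been issued) and `clients` is a collection of open WebSocket-like objects. The channel name `new_message`, the `DB_EVENT` message type, and the function name `bridgeNotifications` are made up for illustration:

```javascript
const OPEN = 1; // readyState value of an open WebSocket

// Forward PostgreSQL NOTIFY payloads to every connected WebSocket client.
// pgClient: assumed to be a connected node-postgres Client.
// clients: assumed to be an iterable of sockets exposing send() and readyState.
async function bridgeNotifications(pgClient, clients, channel = 'new_message') {
  pgClient.on('notification', (msg) => {
    if (msg.channel !== channel) return;
    const frame = JSON.stringify({ type: 'DB_EVENT', channel, payload: msg.payload });
    for (const ws of clients) {
      if (ws.readyState === OPEN) ws.send(frame);
    }
  });
  // LISTEN cannot take bind parameters, so only pass trusted channel names.
  await pgClient.query(`LISTEN ${channel}`);
}
```

On the database side, a trigger or application code would call `NOTIFY new_message, '...'` (or `pg_notify`) to publish. Keep payloads small and send only identifiers when rows are large, since PostgreSQL caps the payload size.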

Redis Pub/Sub as Backbone

Redis is ideal for distributing events across multiple WebSocket server instances:

MongoDB Change Streams

Change streams (available since MongoDB 3.6, and requiring a replica set or sharded cluster) let you react to data changes in real time:


Backend Layer

Node.js — ws (Raw WebSocket)

The ws library is a fast, lightweight option with a minimal API:

Socket.IO Server

Socket.IO adds rooms, broadcasts, and automatic transport fallback:

Python FastAPI WebSocket

Go gorilla/websocket

JWT Authentication at Handshake
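A sketch of what verification at the handshake can look like, using only Node's `crypto` module. It assumes HS256-signed tokens passed as a `?token=` query parameter on the upgrade URL. Both that convention and the helper names are assumptions for illustration; in production a maintained library such as `jsonwebtoken` is usually the safer choice.

```javascript
const crypto = require('node:crypto');

// Verify an HS256 JWT; return its payload, or null if invalid or expired.
function verifyJwtHS256(token, secret, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = String(token).split('.');
  if (parts.length !== 3) return null;
  const [headerB64, payloadB64, signatureB64] = parts;

  let header, payload;
  try {
    header = JSON.parse(Buffer.from(headerB64, 'base64url').toString('utf8'));
    payload = JSON.parse(Buffer.from(payloadB64, 'base64url').toString('utf8'));
  } catch {
    return null;
  }
  if (!header || !payload || typeof payload !== 'object') return null;
  // Pin the algorithm so a token cannot downgrade it (for example to "none").
  if (header.alg !== 'HS256') return null;

  const expected = crypto
    .createHmac('sha256', secret)
    .update(`${headerB64}.${payloadB64}`)
    .digest();
  const actual = Buffer.from(signatureB64, 'base64url');
  if (actual.length !== expected.length || !crypto.timingSafeEqual(actual, expected)) {
    return null;
  }

  if (typeof payload.exp === 'number' && payload.exp <= nowSeconds) return null;
  return payload;
}

// Pull the token from the upgrade request URL, e.g. /ws?token=...
function authenticateUpgrade(req, secret) {
  const url = new URL(req.url, 'http://localhost');
  return verifyJwtHS256(url.searchParams.get('token') || '', secret);
}
```

Rejecting the connection before the upgrade completes means unauthenticated clients never hold an open socket. Query-string tokens can end up in access logs, so short-lived tokens are advisable with this approach.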

Pattern: Rooms and Channels
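With raw WebSockets, rooms are just bookkeeping: a map from room name to the set of sockets that joined it. A minimal sketch, where `RoomRegistry` and its method names are hypothetical and the sockets are assumed to expose `send()` and `readyState` as `ws` and browser sockets do:

```javascript
const OPEN = 1; // readyState value of an open WebSocket

// Minimal room registry: room name -> Set of sockets.
class RoomRegistry {
  constructor() {
    this.rooms = new Map();
  }

  join(room, ws) {
    if (!this.rooms.has(room)) this.rooms.set(room, new Set());
    this.rooms.get(room).add(ws);
  }

  leave(room, ws) {
    const members = this.rooms.get(room);
    if (!members) return;
    members.delete(ws);
    if (members.size === 0) this.rooms.delete(room); // drop empty rooms
  }

  // Call from the socket's 'close' handler so dead sockets do not accumulate.
  leaveAll(ws) {
    for (const room of [...this.rooms.keys()]) this.leave(room, ws);
  }

  // Send to everyone in the room except an optional sender.
  // Returns the number of sockets the message was handed to.
  broadcast(room, data, except = null) {
    const frame = JSON.stringify(data);
    let delivered = 0;
    for (const ws of this.rooms.get(room) ?? []) {
      if (ws !== except && ws.readyState === OPEN) {
        ws.send(frame);
        delivered++;
      }
    }
    return delivered;
  }
}
```

Serializing once per broadcast rather than once per recipient keeps the cost of large rooms down.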


Frontend Layer

Native WebSocket API

Reconnection with Exponential Backoff
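The native API does not reconnect on its own, so the client has to schedule retries. A sketch of one common approach: double the delay on each failed attempt, cap it, and add jitter so many clients do not reconnect in lockstep after an outage. The function names, default values, and the `createSocket` factory are assumptions for illustration:

```javascript
// Delay before reconnect attempt number `attempt` (0-based), in milliseconds.
function backoffDelay(
  attempt,
  { baseMs = 500, maxMs = 30000, jitter = 0.5, random = Math.random } = {}
) {
  const exponential = Math.min(maxMs, baseMs * 2 ** attempt);
  // Shrink the delay by up to `jitter` (as a fraction) at random.
  return Math.round(exponential * (1 - jitter * random()));
}

// Keep a connection alive: retry with growing delays, reset after a successful open.
// createSocket: factory returning a WebSocket-like object that supports
// onopen/onclose/onmessage handlers, as the browser WebSocket does.
function connectWithBackoff(createSocket, onMessage, options = {}) {
  let attempt = 0;
  let stopped = false;
  let timer = null;
  let socket = null;

  const open = () => {
    socket = createSocket();
    socket.onopen = () => { attempt = 0; };
    socket.onmessage = onMessage;
    socket.onclose = () => {
      if (stopped) return;
      timer = setTimeout(open, backoffDelay(attempt++, options));
    };
  };
  open();

  return {
    stop() {
      stopped = true;
      clearTimeout(timer);
      socket?.close();
    },
  };
}
```

In a browser this might be used as `connectWithBackoff(() => new WebSocket('wss://example.com/ws'), (e) => console.log(e.data))`, where the URL is a placeholder.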

Socket.IO Client

React Hook useWebSocket

Zustand Integration

Binary Data (ArrayBuffer/Blob)
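For high-frequency messages, a fixed binary layout avoids the size and parsing cost of JSON. A sketch using `ArrayBuffer` and `DataView`; the 13-byte layout (type, entity id, x, y) and the function names are invented for this example:

```javascript
const MSG_POSITION = 1;
const POSITION_BYTES = 13; // 1-byte type + 4-byte id + two 32-bit floats

// Pack a position update into a compact binary message.
function encodePosition(entityId, x, y) {
  const buffer = new ArrayBuffer(POSITION_BYTES);
  const view = new DataView(buffer);
  view.setUint8(0, MSG_POSITION);
  view.setUint32(1, entityId); // DataView is big-endian by default
  view.setFloat32(5, x);
  view.setFloat32(9, y);
  return buffer;
}

// Unpack a message produced by encodePosition().
function decodePosition(buffer) {
  const view = new DataView(buffer);
  if (view.byteLength !== POSITION_BYTES || view.getUint8(0) !== MSG_POSITION) {
    throw new Error('Not a position message');
  }
  return {
    entityId: view.getUint32(1),
    x: view.getFloat32(5),
    y: view.getFloat32(9),
  };
}
```

In the browser, set `ws.binaryType = 'arraybuffer'` before receiving so that `event.data` arrives as an `ArrayBuffer` rather than a `Blob`, then send with `ws.send(encodePosition(7, 1.5, -2.25))`.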


Scaling and Infrastructure

Sticky Sessions

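When clients can fall back to HTTP long polling (as Socket.IO does by default), the load balancer must route every request from a given client to the same server. With Nginx, one common approach is the `ip_hash` directive in the `upstream` block.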
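The idea behind IP-based stickiness can be sketched in a few lines. This illustrates the principle only and is not Nginx's actual algorithm; `pickUpstream` and the upstream addresses are made up:

```javascript
const crypto = require('node:crypto');

// Map a client IP to one upstream deterministically, so repeat requests
// from the same address always land on the same server.
function pickUpstream(clientIp, upstreams) {
  const digest = crypto.createHash('sha256').update(clientIp).digest();
  const index = digest.readUInt32BE(0) % upstreams.length;
  return upstreams[index];
}

const upstreams = ['10.0.0.1:3000', '10.0.0.2:3000', '10.0.0.3:3000'];
console.log(pickUpstream('203.0.113.7', upstreams));
```

A plain modulo like this remaps most clients whenever the upstream list changes, and clients behind a shared NAT all land on one server. Cookie-based stickiness or consistent hashing avoids both problems.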

Redis Adapter for Socket.IO

Enables broadcast across multiple instances with one line:

Horizontal Scaling with Pub/Sub

Architecture for multiple WS servers — each instance subscribes to a shared Redis channel:
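The fan-out pattern can be shown without a Redis server by substituting an in-process event bus for the shared channel. In this sketch `InMemoryBus` stands in for Redis Pub/Sub, and `WsInstance` and the channel name `ws:broadcast` are hypothetical; with real Redis, each instance would hold its own subscriber connection instead:

```javascript
const { EventEmitter } = require('node:events');

const OPEN = 1; // readyState value of an open WebSocket

// Stand-in for a Redis channel: publish() delivers to every subscriber.
class InMemoryBus extends EventEmitter {
  publish(channel, message) { this.emit(channel, message); }
  subscribe(channel, handler) { this.on(channel, handler); }
}

// One WebSocket server instance with its own local set of sockets.
class WsInstance {
  constructor(name, bus, channel = 'ws:broadcast') {
    this.name = name;
    this.bus = bus;
    this.channel = channel;
    this.sockets = new Set();
    // Every instance receives every published message and relays it locally.
    bus.subscribe(channel, (message) => this.deliverLocal(message));
  }

  deliverLocal(message) {
    for (const ws of this.sockets) {
      if (ws.readyState === OPEN) ws.send(message);
    }
  }

  // Called when a local client sends something all clients should see.
  broadcast(data) {
    this.bus.publish(this.channel, JSON.stringify(data));
  }
}
```

The important property is that an instance never writes to its own sockets directly when broadcasting. It publishes, and then delivers through the same subscription path as every other instance, so all clients see the same stream.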

AWS ALB WebSocket Support

Key CloudFormation settings for WebSocket-compatible ALB:

  • Enable stickiness (lb_cookie) on the Target Group so sessions stay pinned to one instance.
  • Set deregistration_delay low (e.g. 30s) to allow graceful draining.
  • The ALB listener forwards traffic to the target group; no special WebSocket protocol flag is needed — ALB handles the upgrade transparently.

API Gateway WebSocket (Serverless)

AWS API Gateway WebSocket routes each event to a Lambda function. Connection IDs are stored in DynamoDB; to push messages back you call ApiGatewayManagementApi.postToConnection.
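A sketch of the Lambda side of that flow. To keep it self-contained, the connection store and the send function are injected: `store` is assumed to be an async interface over a DynamoDB table, and `post` is assumed to wrap `postToConnection`. Both, along with the name `createHandler`, are hypothetical; only `requestContext.routeKey`, `requestContext.connectionId`, and `body` come from the API Gateway event.

```javascript
// Build a Lambda handler for an API Gateway WebSocket API.
// store: assumed async interface { add(id), remove(id), list() }.
// post: assumed async function (connectionId, data) wrapping postToConnection.
function createHandler({ store, post }) {
  return async function handler(event) {
    const { routeKey, connectionId } = event.requestContext;

    if (routeKey === '$connect') {
      await store.add(connectionId);
    } else if (routeKey === '$disconnect') {
      await store.remove(connectionId);
    } else {
      // $default or a custom route: relay the body to every other connection.
      const ids = await store.list();
      await Promise.all(
        ids
          .filter((id) => id !== connectionId)
          .map(async (id) => {
            try {
              await post(id, event.body);
            } catch (err) {
              // HTTP 410 means the connection is gone; forget it.
              const status = err.statusCode ?? err.$metadata?.httpStatusCode;
              if (status === 410) await store.remove(id);
              else throw err;
            }
          })
      );
    }
    return { statusCode: 200 };
  };
}
```

Because there is no long-lived process, every broadcast is a read of the connection table followed by one API call per recipient, which is worth factoring into cost and latency estimates.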

<callout icon="💰" color="yellow_bg">
**Cost**: API Gateway WebSocket costs $1.00 per million messages + $0.25 per million connection minutes.
</callout>

---

# Real-World Use Cases

## Real-time Chat

```javascript
// --- Backend (Node.js) ---
const WebSocket = require('ws');
// `db` is assumed to be a connected node-postgres Pool created elsewhere.

class ChatRoom {
  constructor(roomId) {
    this.roomId = roomId;
    this.clients = new Map(); // userId -> { ws, username }
    this.messages = [];       // in-memory cache (last 100)
    this.typingUsers = new Set();
  }

  join(userId, username, ws) {
    this.clients.set(userId, { ws, username });
    // Replay recent history to the newcomer, then announce them to everyone else
    ws.send(JSON.stringify({ type: 'HISTORY', messages: this.messages.slice(-50) }));
    this.broadcast({ type: 'USER_JOINED', userId, username, userCount: this.clients.size }, userId);
  }

  leave(userId) {
    const client = this.clients.get(userId);
    if (!client) return;
    this.clients.delete(userId);
    this.typingUsers.delete(userId);
    this.broadcast({ type: 'USER_LEFT', userId, username: client.username, userCount: this.clients.size });
  }

  sendMessage(userId, text) {
    const { username } = this.clients.get(userId) || {};
    if (!username) return;
    const message = { id: Date.now() + Math.random(), userId, username, text, timestamp: Date.now() };
    this.messages.push(message);
    if (this.messages.length > 100) this.messages.shift();
    this.broadcast({ type: 'MESSAGE', message });
    // Persist asynchronously; log failures instead of leaving the promise unhandled
    db.query(
      'INSERT INTO messages (room_id, user_id, text, created_at) VALUES ($1,$2,$3,$4)',
      [this.roomId, userId, text, new Date()]
    ).catch((err) => console.error('Failed to persist message', err));
  }

  setTyping(userId, isTyping) {
    isTyping ? this.typingUsers.add(userId) : this.typingUsers.delete(userId);
    this.broadcast({ type: 'TYPING', userId, username: this.clients.get(userId)?.username, isTyping }, userId);
  }

  broadcast(data, excludeUserId = null) {
    this.clients.forEach((client, uid) => {
      if (uid !== excludeUserId && client.ws.readyState === WebSocket.OPEN) {
        client.ws.send(JSON.stringify(data));
      }
    });
  }
}

// --- Frontend (React, separate file) ---
import { useEffect, useRef, useState } from 'react';

function ChatRoom({ roomId, userId, username }) {
  const [messages, setMessages] = useState([]);
  const [typingUsers, setTypingUsers] = useState([]);
  const [inputText, setInputText] = useState('');
  const wsRef = useRef(null);
  const typingTimeout = useRef(null);

  // Sending while the socket is still connecting throws, so guard on readyState
  const send = (payload) => {
    const ws = wsRef.current;
    if (ws?.readyState === WebSocket.OPEN) ws.send(JSON.stringify(payload));
  };

  useEffect(() => {
    const ws = new WebSocket('ws://localhost:8080'); // use wss:// in production
    wsRef.current = ws;
    ws.onopen = () => ws.send(JSON.stringify({ type: 'JOIN', roomId, userId, username }));
    ws.onmessage = (e) => {
      const data = JSON.parse(e.data);
      if (data.type === 'HISTORY') setMessages(data.messages);
      else if (data.type === 'MESSAGE') setMessages(prev => [...prev, data.message]);
      else if (data.type === 'TYPING') setTypingUsers(prev => {
        // Remove first so repeated TYPING events do not add duplicates
        const others = prev.filter(u => u !== data.username);
        return data.isTyping ? [...others, data.username] : others;
      });
    };
    return () => {
      clearTimeout(typingTimeout.current);
      ws.close();
    };
  }, [roomId, userId, username]);

  const handleInputChange = (e) => {
    setInputText(e.target.value);
    send({ type: 'TYPING', isTyping: true });
    // Clear the typing indicator after 3 seconds without input
    clearTimeout(typingTimeout.current);
    typingTimeout.current = setTimeout(() => send({ type: 'TYPING', isTyping: false }), 3000);
  };

  const handleSend = () => {
    if (inputText.trim()) {
      send({ type: 'MESSAGE', text: inputText });
      setInputText('');
    }
  };

  return (
    <div className="chat-room">
      <div className="messages">
        {messages.map(msg => <div key={msg.id}><strong>{msg.username}:</strong> {msg.text}</div>)}
      </div>
      {typingUsers.length > 0 && <div className="typing-indicator">{typingUsers.join(', ')} is typing...</div>}
      <input value={inputText} onChange={handleInputChange} onKeyDown={(e) => e.key === 'Enter' && handleSend()} />
      <button onClick={handleSend}>Send</button>
    </div>
  );
}
```

## Collaborative Editing (OT Basics)
```javascript
// Backend - Operational Transformation (simplified)
class CollaborativeDoc {
  constructor() {
    this.content = '';
    this.version = 0;
    this.clients = new Map();
  }

  applyOperation(op) {
    // Op: { type: 'insert'|'delete', position, text?, length? }
    if (op.type === 'insert') {
      this.content = this.content.slice(0, op.position) + op.text + this.content.slice(op.position);
    } else if (op.type === 'delete') {
      this.content = this.content.slice(0, op.position) + this.content.slice(op.position + op.length);
    }
    this.version++;
    this.broadcast(op);
  }

  transform(op1, op2) {
    // Transform op1 assuming op2 was applied first
    if (op1.type === 'insert' && op2.type === 'insert') {
      return op1.position < op2.position ? op1 : { ...op1, position: op1.position + op2.text.length };
    }
    return op1; // In practice, use ShareDB or Yjs
  }

  broadcast(operation, excludeClientId) {
    this.clients.forEach((ws, clientId) => {
      if (clientId !== excludeClientId) {
        ws.send(JSON.stringify({ type: 'OPERATION', operation, version: this.version }));
      }
    });
  }
}

// Frontend
class CollaborativeEditor {
  constructor(docId, ws) {
    this.docId = docId;
    this.ws = ws;
    this.content = '';
    this.version = 0;
    this.pendingOps = [];
    this.editor = document.getElementById('editor');

    this.editor.addEventListener('input', () => {
      const cursorPos = this.editor.selectionStart;
      const newContent = this.editor.value;
      const diff = newContent.length - this.content.length;
      const op = diff > 0
        ? { type: 'insert', position: cursorPos - diff, text: newContent.slice(cursorPos - diff, cursorPos) }
        : { type: 'delete', position: cursorPos, length: -diff };
      this.content = newContent;
      this.pendingOps.push(op);
      this.ws.send(JSON.stringify({ type: 'OPERATION', docId: this.docId, operation: op, version: this.version }));
    });

    this.ws.onmessage = (event) => {
      const data = JSON.parse(event.data);
      if (data.type !== 'OPERATION') return;
      // Transform incoming op against all pending ops, then apply
      let op = data.operation;
      for (const pending of this.pendingOps) op = this.transform(op, pending);
      const cursor = this.editor.selectionStart;
      if (op.type === 'insert') this.content = this.content.slice(0, op.position) + op.text + this.content.slice(op.position);
      else if (op.type === 'delete') this.content = this.content.slice(0, op.position) + this.content.slice(op.position + op.length);
      this.editor.value = this.content;
      this.editor.setSelectionRange(cursor, cursor);
      this.version++;
    };
  }

  transform(op1, op2) { return op1; } // simplified
}
```

Live Notifications

Multiplayer Game State Sync


Performance

Performance Comparison

<table>
<tr> <td>Metric</td> <td>WebSocket</td> <td>HTTP Long Polling</td> <td>SSE</td> </tr>
<tr> <td>Average latency</td> <td>1-5ms</td> <td>100-500ms</td> <td>5-20ms</td> </tr>
<tr> <td>Overhead per message</td> <td>2-6 bytes</td> <td>~800 bytes</td> <td>~50 bytes</td> </tr>
<tr> <td>Throughput (msg/s)</td> <td>10,000+</td> <td>~10</td> <td>~100</td> </tr>
<tr> <td>Simultaneous connections</td> <td>100,000+</td> <td>~1,000</td> <td>~10,000</td> </tr>
<tr> <td>CPU usage (10k conn)</td> <td>~5%</td> <td>~40%</td> <td>~15%</td> </tr>
<tr> <td>Memory per connection</td> <td>~8KB</td> <td>~50KB</td> <td>~15KB</td> </tr>
<tr> <td>Battery drain (mobile)</td> <td>Low</td> <td>High</td> <td>Medium</td> </tr>
</table>

## Memory per Connection

<table>
<tr> <td>Library</td> <td>Base memory</td> <td>Per connection</td> <td>10k connections</td> </tr>
<tr> <td>ws (Node.js)</td> <td>~50MB</td> <td>~8KB</td> <td>~130MB</td> </tr>
<tr> <td>Socket.IO</td> <td>~80MB</td> <td>~15KB</td> <td>~230MB</td> </tr>
<tr> <td>uWebSockets.js</td> <td>~20MB</td> <td>~3KB</td> <td>~50MB</td> </tr>
<tr> <td>gorilla/websocket (Go)</td> <td>~10MB</td> <td>~4KB</td> <td>~50MB</td> </tr>
<tr> <td>FastAPI (Python)</td> <td>~40MB</td> <td>~20KB</td> <td>~240MB</td> </tr>
</table>

## Max Connections per Server

<columns>
<column>
**Node.js (ws):** Single-core ~10k, multi-core ~100k, cluster ~1M. Limits: file descriptors (`ulimit -n`), RAM, bandwidth.
</column>
<column>
**Go (gorilla):** Single-core ~50k, multi-core ~500k, with load balancer ~5M. Tune `GOMAXPROCS` and `GOMEMLIMIT`.
</column>
</columns>

## Benchmark: Messages per Second

Use `wscat` or `ws-bench` for load testing.
Typical results:

<table>
<tr> <td>Scenario</td> <td>Connections</td> <td>Messages/s</td> <td>Latency p99</td> </tr>
<tr> <td>Local (ws)</td> <td>100</td> <td>50,000</td> <td>2ms</td> </tr>
<tr> <td>Local (Socket.IO)</td> <td>100</td> <td>30,000</td> <td>5ms</td> </tr>
<tr> <td>LAN</td> <td>1,000</td> <td>100,000</td> <td>10ms</td> </tr>
<tr> <td>Internet (US-East)</td> <td>10,000</td> <td>200,000</td> <td>50ms</td> </tr>
<tr> <td>Global (multi-region)</td> <td>100,000</td> <td>500,000</td> <td>200ms</td> </tr>
</table>

---

# Security

## Origin Validation

```javascript
const WebSocket = require('ws');

const allowedOrigins = [
  'https://example.com',
  'https://app.example.com'
];

const wss = new WebSocket.Server({
  port: 8080, // ws requires exactly one of `port`, `server`, or `noServer`
  // Runs before the upgrade completes; rejected clients never get a socket
  verifyClient: (info, callback) => {
    const origin = info.origin || info.req.headers.origin;

    if (allowedOrigins.includes(origin)) {
      callback(true);
    } else {
      console.log(`Rejected origin: ${origin}`);
      callback(false, 403, 'Origin not allowed');
    }
  }
});
```

## Rate Limiting
```javascript
const rateLimit = new Map();

function checkRateLimit(id, limit = 10, windowMs = 1000) {
  const now = Date.now();
  const d = rateLimit.get(id) || { count: 0, resetAt: now + windowMs };
  if (now > d.resetAt) { d.count = 0; d.resetAt = now + windowMs; }
  d.count++;
  rateLimit.set(id, d);
  return d.count <= limit;
}

wss.on('connection', (ws, req) => {
  const ip = req.socket.remoteAddress;
  ws.on('message', (data) => {
    if (!checkRateLimit(ip, 10, 1000)) { ws.send(JSON.stringify({ error: 'Rate limit exceeded' })); return; }
    // process...
  });
});
```

DoS Protection
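Beyond per-message rate limits, two cheap defenses are capping concurrent connections per IP and rejecting oversized messages before parsing them. A sketch of both; the class name, function name, and default limits are assumptions for illustration:

```javascript
// Track open connections per IP and refuse new ones above a cap.
class ConnectionLimiter {
  constructor(maxPerIp = 20) {
    this.maxPerIp = maxPerIp;
    this.counts = new Map(); // ip -> open connection count
  }

  // Returns true and reserves a slot, or false if the IP is at its cap.
  tryAcquire(ip) {
    const current = this.counts.get(ip) ?? 0;
    if (current >= this.maxPerIp) return false;
    this.counts.set(ip, current + 1);
    return true;
  }

  // Call from the socket's 'close' handler to free the slot.
  release(ip) {
    const current = this.counts.get(ip) ?? 0;
    if (current <= 1) this.counts.delete(ip);
    else this.counts.set(ip, current - 1);
  }
}

// Reject oversized messages before JSON.parse ever sees them.
function withinSizeLimit(data, maxBytes = 64 * 1024) {
  return Buffer.byteLength(data) <= maxBytes;
}
```

In a `connection` handler, call `tryAcquire(ip)` first and close the socket if it returns false. The `ws` server also accepts a `maxPayload` option that enforces a size cap at the protocol level, which is preferable to checking after the message has been buffered.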

WSS (TLS)

Always use wss:// in production. Two options:

Option 1 — TLS directly in Node.js:

Option 2 — Nginx as WSS proxy (recommended):


Decision Framework

When to use WebSocket?

Use WebSocket for:

  • Real-time chat
  • Multiplayer games
  • Collaborative editing
  • Trading / financial dashboards
  • Audio/video streaming
  • IoT (device to server bidirectional)

Avoid WebSocket when:

  • Communication is server-to-client only (use SSE instead)
  • Updates are infrequent (use polling)
  • You are building a public API (responses are hard to cache)
  • The network may block it (e.g. corporate firewalls or proxies)

Decision Tree
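The criteria from the two lists above can be condensed into a small function. This is one possible ordering of the questions rather than a definitive rule; the parameter names and the threshold of one update per minute are assumptions chosen for illustration:

```javascript
// Suggest a transport from the criteria in this section.
function chooseTransport({
  clientSendsFrequently,   // does the client push data, not just receive it?
  updatesPerMinute,        // how often does the data change?
  mustBeCacheable = false, // public API that benefits from HTTP caching
  restrictiveFirewalls = false, // networks that may block the upgrade
}) {
  if (mustBeCacheable) return 'polling';
  if (updatesPerMinute <= 1) return 'polling';       // infrequent updates
  if (!clientSendsFrequently) return 'sse';          // server-to-client only
  if (restrictiveFirewalls) return 'long-polling';   // upgrade may be blocked
  return 'websocket';
}

console.log(chooseTransport({ clientSendsFrequently: true, updatesPerMinute: 600 }));
```

A chat application (frequent traffic in both directions) comes out as `websocket`, while a server log stream with no client input comes out as `sse`.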

Detailed Comparison

<table>
<tr> <td>Criterion</td> <td>WebSocket</td> <td>SSE</td> <td>Long Polling</td> <td>Polling</td> </tr>
<tr> <td>Bidirectional</td> <td>Yes</td> <td>No</td> <td>No</td> <td>No</td> </tr>
<tr> <td>Real-time</td> <td>Excellent</td> <td>Good</td> <td>Fair</td> <td>No</td> </tr>
<tr> <td>Efficiency</td> <td>Excellent</td> <td>Good</td> <td>Poor</td> <td>Poor</td> </tr>
<tr> <td>Simplicity</td> <td>Low</td> <td>High</td> <td>Medium</td> <td>Very High</td> </tr>
<tr> <td>HTTP/2 support</td> <td>Limited (RFC 8441)</td> <td>Yes</td> <td>Yes</td> <td>Yes</td> </tr>
<tr> <td>CDN caching</td> <td>No</td> <td>No</td> <td>Partial</td> <td>Yes</td> </tr>
<tr> <td>Mobile friendly</td> <td>Good</td> <td>Good</td> <td>Poor</td> <td>Poor</td> </tr>
<tr> <td>Firewall friendly</td> <td>No</td> <td>Yes</td> <td>Yes</td> <td>Yes</td> </tr>
</table>

## Examples by Use Case

<columns>
<column>
**WebSocket**
- Slack, Discord
- Google Docs
- Figma, Miro
- Binance, trading apps
- Among Us, .io games
</column>
<column>
**SSE**
- Twitter feed updates
- Stock tickers (unidirectional)
- News feeds
- Server log streaming
- Progress bars (builds, uploads)
</column>
<column>
**Long Polling**
- Facebook notifications (fallback)
- WhatsApp Web (fallback)
- Legacy systems
</column>
<column>
**Polling**
- Email inbox refresh
- Dashboard metrics (1/min)
- Status checks
</column>
</columns>

## Cost Comparison

<table>
<tr> <td>Solution</td> <td>Infrastructure</td> <td>Bandwidth</td> <td>Complexity</td> </tr>
<tr> <td>WebSocket (self-hosted)</td> <td>$$</td> <td>$</td> <td>$$$</td> </tr>
<tr> <td>AWS API Gateway WS</td> <td>$$$</td> <td>$$</td> <td>$</td> </tr>
<tr> <td>Pusher/Ably</td> <td>$$$$</td> <td>$$$</td> <td>$</td> </tr>
<tr> <td>SSE (self-hosted)</td> <td>$</td> <td>$</td> <td>$</td> </tr>
<tr> <td>Long Polling</td> <td>$$$</td> <td>$$$</td> <td>$$</td> </tr>
</table>