Distributed System Refreshers: Long Polling, Webhooks, and WebSockets
Introduction
When building real-time or near-real-time systems, you need a mechanism to notify your application of new events or data changes. Three common approaches are Long Polling, Webhooks, and WebSockets. Each has distinct use cases, advantages, and limitations.
How Does Polling Work?
Polling is a technique where a client repeatedly makes HTTP requests to a server at regular intervals to check for new data.
Characteristics of Polling:
Simple Implementation: Clients periodically ask the server for updates.
Latency: An update can wait up to one full polling interval before the client sees it.
Resource Intensive: Most requests return nothing new, so frequent polling generates a significant amount of unnecessary traffic.
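As a rough illustration of these characteristics, here is a minimal sketch of interval polling. The `FakeServer` class and its `fetch_updates` method are hypothetical stand-ins for a real HTTP endpoint, so the example runs without a network; in practice `fetch` would wrap an HTTP GET.

```python
import time
from typing import Callable, List, Optional


def poll(fetch: Callable[[], Optional[str]], interval_s: float, max_requests: int) -> List[str]:
    """Call `fetch` at a fixed interval and collect any non-empty results."""
    received = []
    for _ in range(max_requests):
        item = fetch()  # one request/response round trip
        if item is not None:
            received.append(item)
        time.sleep(interval_s)  # data arriving now waits until the next request
    return received


class FakeServer:
    """Hypothetical server stub: only the 4th request finds new data."""

    def __init__(self) -> None:
        self.requests = 0

    def fetch_updates(self) -> Optional[str]:
        self.requests += 1
        return "order shipped" if self.requests == 4 else None


server = FakeServer()
updates = poll(server.fetch_updates, interval_s=0.01, max_requests=5)
print(updates, "from", server.requests, "requests")
```

Four of the five requests here return nothing, which is the unnecessary traffic described above.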
Long Polling
Long Polling is an improvement over basic polling. Instead of the server immediately responding to a client's request, it keeps the connection open until there is new data or a timeout occurs.
How It Works:
Client sends an HTTP request to the server.
The server holds the request until there’s data or a timeout.
Once data is available (or timeout occurs), the server responds.
The client processes the data and immediately sends a new request.
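The steps above can be sketched in-process. This is an assumption-heavy simplification: a `queue.Queue` stands in for the server's event source, and `handle_long_poll` stands in for an HTTP handler that blocks before responding. All names are hypothetical.

```python
import queue
import threading
import time
from typing import List, Optional

# Hypothetical in-memory event source shared by the "server" and a publisher.
events: "queue.Queue[str]" = queue.Queue()


def handle_long_poll(timeout_s: float) -> Optional[str]:
    """Server side: hold the request until data arrives or the timeout expires."""
    try:
        return events.get(timeout=timeout_s)
    except queue.Empty:
        return None  # a real server might answer with an empty response here


def long_poll_client(rounds: int, timeout_s: float) -> List[str]:
    """Client side: process each response, then immediately ask again."""
    received = []
    for _ in range(rounds):
        data = handle_long_poll(timeout_s)  # stands in for a blocking HTTP request
        if data is not None:
            received.append(data)
    return received


# A publisher produces one event 50 ms after the client starts waiting.
threading.Timer(0.05, events.put, args=["price changed"]).start()
started = time.monotonic()
received = long_poll_client(rounds=2, timeout_s=0.3)
elapsed = time.monotonic() - started
print(received, f"{elapsed:.2f}s")
```

The first round returns as soon as the event is published rather than waiting for the next interval. The second round ends on the timeout, and a real client would simply send the next request.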
Advantages:
Lower Latency: Data is sent as soon as it’s available.
Reduced Redundancy: Avoids frequent requests when there’s no data to send.
Disadvantages:
Server Load: Requires the server to maintain open connections.
Scalability Issues: Handling many simultaneous connections can be challenging.
What Is a Webhook?
Webhooks are a server-to-server communication mechanism. The client (itself a server that can accept incoming HTTP requests) registers a callback URL with the server, and the server sends an HTTP POST request to that URL whenever there’s new data.
How It Works:
The client subscribes to events by providing a callback URL.
When an event occurs, the server sends an HTTP POST request with the data to the client’s URL.
The client processes the data immediately.
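A minimal sketch of this flow using only Python's standard library is shown below. The event name `order.created`, the `/hooks/orders` path, the `subscriptions` registry, and the `emit` helper are all hypothetical, and both sides run in one process on localhost purely for illustration.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, List
from urllib.parse import urlsplit

received_events: List[dict] = []


class WebhookReceiver(BaseHTTPRequestHandler):
    """Client side: the endpoint behind the callback URL."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        received_events.append(json.loads(self.rfile.read(length)))
        self.send_response(204)  # acknowledge the delivery
        self.end_headers()

    def log_message(self, format: str, *args) -> None:
        pass  # keep the example output quiet


receiver = HTTPServer(("127.0.0.1", 0), WebhookReceiver)  # port 0: pick a free port
threading.Thread(target=receiver.serve_forever, daemon=True).start()

# Step 1: the client subscribes by registering its callback URL.
callback_url = f"http://127.0.0.1:{receiver.server_port}/hooks/orders"
subscriptions: Dict[str, List[str]] = {"order.created": [callback_url]}


def emit(event_type: str, payload: dict) -> List[int]:
    """Step 2, server side: POST the event to every subscribed callback URL."""
    body = json.dumps({"type": event_type, "data": payload}).encode("utf-8")
    statuses = []
    for url in subscriptions.get(event_type, []):
        target = urlsplit(url)
        conn = http.client.HTTPConnection(target.hostname, target.port, timeout=5)
        conn.request("POST", target.path, body=body,
                     headers={"Content-Type": "application/json"})
        response = conn.getresponse()
        response.read()
        statuses.append(response.status)
        conn.close()
    return statuses


statuses = emit("order.created", {"order_id": 42})
receiver.shutdown()
receiver.server_close()
print(statuses, received_events)
```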
Advantages:
Efficient: Eliminates the need for repeated requests by the client.
Real-Time Updates: Events are pushed as they happen.
Disadvantages:
Setup Complexity: Requires hosting and securing a public-facing endpoint.
Reliability Concerns: If the client’s server is down, data may be lost (though retries can mitigate this).
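Two common mitigations for these disadvantages can be sketched briefly. One is verifying an HMAC signature on each delivery so the public endpoint can reject forged requests. The other is retrying failed deliveries with increasing delays. The shared secret, the function names, and the retry parameters below are assumptions for illustration; real providers each define their own signature scheme and retry policy.

```python
import hashlib
import hmac
import time
from typing import Callable


def sign(secret: bytes, body: bytes) -> str:
    """Sender side: compute a hex HMAC-SHA256 signature of the request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_authentic(secret: bytes, body: bytes, signature: str) -> bool:
    """Receiver side: recompute the signature and compare in constant time."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


def deliver_with_retries(send: Callable[[], bool], max_attempts: int = 4,
                         base_delay_s: float = 0.01) -> int:
    """Sender side: retry until `send` succeeds.

    Returns the number of attempts used, or -1 if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if send():
            return attempt
        if attempt < max_attempts:
            time.sleep(base_delay_s * 2 ** (attempt - 1))  # wait longer each time
    return -1


secret = b"hypothetical-shared-secret"
body = b'{"type": "order.created"}'
signature = sign(secret, body)
print(is_authentic(secret, body, signature))         # True
print(is_authentic(secret, body + b" ", signature))  # False: body was altered

# Simulate a receiver that is down for the first two delivery attempts.
outcomes = iter([False, False, True])
attempts_used = deliver_with_retries(lambda: next(outcomes))
print(attempts_used)  # 3
```

Because retries can deliver the same event more than once, receivers usually need to handle duplicates safely.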
Choosing Webhooks or the REST API
Using webhooks has the following advantages over polling the API:
Webhooks require less effort and fewer resources than polling an API.
Webhooks scale better than API calls. If you need to monitor many resources, calling the API for each resource may cause you to hit your API rate limit quota quickly. Instead, you can subscribe to multiple webhook events and receive information only when an event happens.
Webhooks allow near real-time updates, since webhooks are triggered when an event happens.
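To make the scaling argument concrete, here is a small worked calculation. The figures (500 monitored resources, a 60-second polling interval, 2,000 events per day) are invented for illustration only.

```python
SECONDS_PER_DAY = 86_400

# Hypothetical workload.
resources = 500
poll_interval_s = 60
events_per_day = 2_000

# Polling: one API call per resource per interval, whether or not anything changed.
polling_requests_per_day = resources * (SECONDS_PER_DAY // poll_interval_s)

# Webhooks: one delivery per event that actually happens.
webhook_deliveries_per_day = events_per_day

print(polling_requests_per_day)    # 720000
print(webhook_deliveries_per_day)  # 2000
```

Under these assumptions, polling makes 360 times as many requests, and most of them return nothing new.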
What Are WebSockets?
WebSockets enable real-time, bidirectional communication between clients and servers over a single long-lived connection, making them ideal for applications that need instant data updates.
WebSockets are most often used between a browser and a server, although any client can use them. The server exposes a WebSocket endpoint, and clients open a connection to it. They are popular mainly because they are faster and use fewer resources than older workarounds such as polling and long polling.
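Opening a WebSocket connection begins with an HTTP upgrade request. Per RFC 6455, the client sends a random `Sec-WebSocket-Key` header, and the server proves it understood the request by returning a `Sec-WebSocket-Accept` value derived from that key. The sketch below shows only this one step of the handshake. The function name `accept_key` is our own, and the sample key is the one used in the RFC.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept header value from the client's key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

In practice a WebSocket library performs this handshake, along with message framing, on your behalf.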
Advantages:
Low latency and reduced overhead for high-frequency communication.
Maintains a persistent connection, so there is no need to open a new connection for each message.
Limitations:
Requires more sophisticated server infrastructure.
May face compatibility issues with some firewalls or proxies.
Can increase resource usage on the server due to persistent connections.
Highlights
Real-time communication: WebSockets keep a persistent two-way connection open, unlike request/response HTTP APIs, where the server can only reply to requests the client initiates.
Protocol definition: WebSocket is a standardized protocol (RFC 6455). A connection starts as an HTTP request that is upgraded, after which both sides exchange framed messages over the same TCP connection.
Indefinite connections: WebSockets maintain long-lived connections, eliminating the need for continuous polling.
Instant updates: Ideal for applications like chat, online gaming, and sports betting, providing immediate data synchronization.
Connection management: The protocol defines ping/pong frames for detecting dead or idle connections and a closing handshake for shutting down cleanly. Reconnecting after a drop is left to the application.
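Handling dropped or idle connections usually involves some application-level logic on the client. The sketch below shows two common pieces: a capped exponential backoff schedule for reconnect attempts, and a staleness check based on the last heartbeat reply. The function names and default values are assumptions, not part of any particular library.

```python
import random
from typing import List


def reconnect_delays(attempts: int, base_s: float = 1.0, cap_s: float = 30.0,
                     jitter: float = 0.0) -> List[float]:
    """Seconds to wait before each reconnect attempt: doubling, capped at `cap_s`."""
    delays = []
    for attempt in range(attempts):
        delay = min(cap_s, base_s * 2 ** attempt)
        if jitter:
            # Randomize so many clients do not all reconnect at the same moment.
            delay *= random.uniform(1 - jitter, 1 + jitter)
        delays.append(delay)
    return delays


def is_stale(last_pong_at: float, now: float, timeout_s: float = 45.0) -> bool:
    """Treat the connection as dead if no heartbeat reply arrived within `timeout_s`."""
    return now - last_pong_at > timeout_s


print(reconnect_delays(6))                     # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
print(is_stale(last_pong_at=100.0, now=150.0)) # True
```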
Comparison
Here’s a guide to help you decide:
Use Long Polling If:
You don’t control the server and cannot add webhook support to it.
Managing the client side is easier than setting up and maintaining a public-facing endpoint.
The number of clients is relatively small, or you have sufficient server resources.
Use Webhooks If:
You need real-time updates without constantly pinging the server.
You can host and secure a publicly accessible callback URL.
Scalability and efficient resource usage are critical for your application.
Use WebSockets If:
Real-Time Bidirectional Communication: When both client and server need to send updates frequently.
Example: Online multiplayer games, live chat applications, stock market dashboards.
Low Latency Requirements: When immediate responses are critical.
Example: Collaborative tools like Google Docs or Figma.
High-Frequency Updates: When the client requires continuous updates from the server.
Summary
Use Long Polling for occasional real-time updates with simple infrastructure.
Use Webhooks for decoupled, event-driven notifications.
Use WebSockets for interactive applications requiring low-latency, real-time communication.