Design Real-Time Stock Price Feed
Overview
This article explains how to design a real-time, low-latency stock price feed for investment platforms. It is aimed at developers who intend to build a trading platform for their users. Along the way, we will consider a few design choices and weigh their pros and cons.
The design is split into a series of four articles. This first article discusses the different design choices and the high-level design of the system. The second and third articles delve into the implementation details, and the last one combines all of it.
Exploring Different Design Choices
Since our entire architecture is event-driven, we need message queues to route data to different microservices for different operations. When choosing a message broker for these asynchronous operations, we will consider the following:
- Broker Scale — The number of messages being processed per second in the system.
- Data Persistency — The ability to recover messages in case of abrupt failures.
- Consumer Capability — Whether the broker is capable of managing one-to-one and/or one-to-many consumers.
- Ordering of Processing of Messages — Whether the order in which the messages are queued is the same as the order in which they are processed by the subscribers.
Scale
For our use case, let’s crunch some numbers. Assume we are creating a price feed for 1,000 stocks and that we receive one price tick per second for each of them (typical of standard price feed APIs such as Binance or Bloomberg). That puts our total scale at 1,000 messages per second. This is so low that scale cannot be a differentiating factor: Kafka and Redis Pub/Sub are commonly cited as handling on the order of a million messages per second, and RabbitMQ around 50k messages per second. For our use case, all of these message brokers pass on this metric.
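The estimate above can be written out as a small calculation. This is only a sketch: the function and variable names are made up for illustration, and the broker figures are the rough order-of-magnitude numbers quoted in this section, not benchmark results.

```typescript
// Total feed throughput: number of stocks times ticks per stock per second.
function messagesPerSecond(stockCount: number, ticksPerStockPerSecond: number): number {
  return stockCount * ticksPerStockPerSecond;
}

// Approximate broker capacities as quoted above (ballpark figures, not benchmarks).
interface BrokerEstimate {
  name: string;
  approxMessagesPerSecond: number;
}

const brokers: BrokerEstimate[] = [
  { name: "Kafka", approxMessagesPerSecond: 1000000 },
  { name: "Redis Pub/Sub", approxMessagesPerSecond: 1000000 },
  { name: "RabbitMQ", approxMessagesPerSecond: 50000 },
];

// 1,000 stocks at one tick per second each.
const feedLoad = messagesPerSecond(1000, 1);

// How many times over each broker could carry the feed.
const headroom = brokers.map((broker) => ({
  name: broker.name,
  headroom: broker.approxMessagesPerSecond / feedLoad,
}));
```

Even the slowest broker in this list has roughly 50 times the capacity the feed needs, which is why scale drops out of the comparison.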
Persistency
Persistence, in simple terms, means that messages won’t be lost even if the subscribers are down. Since our messages are short-lived, we don’t need it for the price feed itself: even if we miss a few price ticks while sending them to our users, it won’t impact our system.
Consumer Capability
Since we will be sending price updates to multiple consumers for different operations, we need a message broker that can manage one-to-many consumers.
Ordering of Processing of Messages
Some message brokers work on a fire-and-forget (broadcasting) strategy: they don’t wait for an acknowledgement from a subscriber before pushing the next message to it.
NOTE: If messages are processed asynchronously, the order in which they are processed might not be the same as the order in which they were queued.
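The effect described in the note can be illustrated with a small deterministic model. It is a simplification under stated assumptions: every message is handed to its own handler at the same instant in the fire-and-forget case, the processing times are invented, and none of the names below come from a real broker API.

```typescript
// A queued message with a hypothetical processing time.
interface QueuedMessage {
  sequence: number;     // position in the queue (1 = queued first)
  processingMs: number; // assumed time a handler needs to process it
}

// Fire-and-forget model: all messages are dispatched at time 0 and handled
// concurrently, so they finish in order of processing time, not queue order.
function fireAndForgetOrder(messages: QueuedMessage[]): number[] {
  return messages
    .slice()
    .sort((a, b) => a.processingMs - b.processingMs || a.sequence - b.sequence)
    .map((m) => m.sequence);
}

// Acknowledged model: the next message is delivered only after the previous
// one has been acknowledged, so they finish in queue order.
function acknowledgedOrder(messages: QueuedMessage[]): number[] {
  return messages
    .slice()
    .sort((a, b) => a.sequence - b.sequence)
    .map((m) => m.sequence);
}

const queued: QueuedMessage[] = [
  { sequence: 1, processingMs: 30 },
  { sequence: 2, processingMs: 5 },
  { sequence: 3, processingMs: 10 },
];

const concurrentOrder = fireAndForgetOrder(queued); // [2, 3, 1]
const serialOrder = acknowledgedOrder(queued);      // [1, 2, 3]
```

In this model the first queued message finishes last when nothing waits for acknowledgements, which is exactly the situation an order-sensitive consumer has to avoid.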
Redis Pub/Sub is a good fit for our use case when we want to send price ticks to our users with minimum latency, since the messages are short-lived and losing a few ticks is tolerable. Furthermore, there is no need to send stale prices to clients, so persistence is not needed. Redis works in memory and is extremely fast, which reduces the latency of the price updates.
But Redis Pub/Sub does not guarantee the order in which subscribers process messages, since it works on a broadcasting model. Therefore, we need Kafka or RabbitMQ for services where the order of processing matters, such as the service responsible for updating the candlesticks. So in our final architecture, we will use a combination of RabbitMQ and Redis Pub/Sub.
There are three different ways of serving our real-time stock price feed from our api-server to our clients:
Polling
In polling, a client makes XHR requests to the server repeatedly at a regular interval (e.g. every 0.5 seconds) to check for new prices.
Pros
- Easy to implement.
Cons
- Increases Server Load because of repeated requests.
- Increases Client resource utilization because of repeated requests.
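To put a number on the server-load concern, the request rate generated by polling can be estimated as follows. This is a rough sketch: the client count is a hypothetical figure chosen for illustration, and only the 0.5 second interval comes from the example above.

```typescript
// Each polling client sends 1 / interval requests per second,
// whether or not the price has changed since its last request.
function pollingRequestsPerSecond(clientCount: number, intervalSeconds: number): number {
  if (intervalSeconds <= 0) {
    throw new Error("intervalSeconds must be positive");
  }
  return clientCount / intervalSeconds;
}

// Hypothetical audience size (an assumption, not a figure from this article).
const connectedClients = 10000;

// 10,000 clients polling every 0.5 s generate 20,000 requests per second.
const pollingLoad = pollingRequestsPerSecond(connectedClients, 0.5);
```

The load grows with the number of clients and the polling frequency rather than with the number of price changes, which is what makes polling expensive for a feed like this.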
WebSockets
WebSocket is a separate layer 7 (application-layer) communication protocol that provides full-duplex communication channels over a single TCP connection. The WebSocket protocol allows communication between a client and a web server with less overhead (there is no need to create and destroy a connection every time), providing real-time data transfer to and from the server. WebSockets keep the connection open, allowing data to be passed back and forth between the client and the server.
Pros
- The bi-directional channel allows full-duplex communication.
- Provides Low Latency Data Communication.
- Native support in more browsers.
- Supports binary as well as UTF-8 data transmission.
Cons
- Browsers limit the number of open connections (255 in Chrome and 200 in Firefox).
- Proxy Servers can wreak havoc with WebSockets running over unsecured HTTP.
- It can be overkill for some types of applications, where the client doesn’t need to send data to a server in real-time.
Server-Sent Events
Server-Sent Events (SSE) is a server push technology in which a client establishes a long-lived, persistent connection with the server, allowing the server to send new events/messages to the client (whenever available) over an HTTP connection. SSE is commonly used to broadcast message updates or send continuous data streams to a client, e.g. a Twitter feed, an Instagram feed, or a stock price feed. On the client side, SSE uses a JavaScript API called EventSource to receive the updates.
Pros
- Uses HTTP instead of a custom protocol
- Built-in support for re-connection and event-id
- No trouble with corporate firewalls (Corporate Proxies) doing packet inspection.
Cons
- Supports only UTF-8 data transmission.
- Browsers limit the number of open connections (very low for some clients, typically six per domain when SSE runs over HTTP/1.1).
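To make SSE more concrete, here is a minimal sketch of how one price update could be serialised into the text/event-stream wire format. The PriceTick shape, the "price" event name and the sample values are assumptions made for illustration; only the id, event and data field names and the blank-line terminator come from the SSE format itself.

```typescript
// One price update as sent to clients (field names are illustrative).
interface PriceTick {
  stock: string;
  price: number;
  timestamp: number; // Unix epoch in milliseconds
}

// Serialise a tick as a single SSE event: an "id:" line, an "event:" line,
// one "data:" line per line of payload, and a blank line to end the event.
function formatSseEvent(tick: PriceTick, eventId: number, eventName: string = "price"): string {
  const payload = JSON.stringify(tick);
  const dataLines = payload.split("\n").map((line) => "data: " + line);
  return ["id: " + eventId, "event: " + eventName].concat(dataLines).join("\n") + "\n\n";
}

// Sample values, invented for the example.
const sseFrame = formatSseEvent({ stock: "AAPL", price: 189.25, timestamp: 1700000000000 }, 42);
```

The server sends such frames on a response with the Content-Type text/event-stream. In the browser, an EventSource can listen for the named event with addEventListener, and on reconnection it reports the last id it saw in the Last-Event-ID header, which is the built-in re-connection support listed in the pros above.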
Because of the simplex (one-way) nature of SSE, it fits right into our use case. Since we will just be broadcasting new price updates to all the connected clients, we don’t need any information from the clients to send these updates, so WebSockets would be overkill. Also, SSE is very easy to implement with the native APIs of Node.js.
Now that we have considered the different design choices, we can dig deeper into the high-level design of the entire system and discuss each of its components separately.
Exchange or External Price Source
This service creates the price ticks of a stock. It could be either an internal exchange service or an external price source such as Binance or Finnhub. It could push new prices to us over WebSockets or SSE (push mechanism), or we could fetch prices from it at regular intervals (pull mechanism).
Price Consumer Service
This service consumes the price ticks from the Exchange or External Price Source and passes them into the system. This is the right place to add validations on price ticks before allowing them to flow into the system. You could also build a synthetic price feed from the raw price feed by adding safety spreads or currency conversions. The service then publishes the new prices on the price_feed:${stock} Redis channel and the price-ticks fanout exchange. The exchange simply broadcasts the prices to all the queues bound to it. You can run one instance of price-consumer-service for each stock, assign a group of stocks to each instance, or combine both strategies. It depends on your use case.
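The validation and synthetic-price steps could look roughly like the sketch below. Everything in it is an assumption made for illustration: the tick shape, the validation rules, the spread expressed in basis points, and the helper names. Only the price_feed:${stock} channel naming comes from the design above. Publishing to Redis and RabbitMQ is left out so that the example stays free of client-library calls.

```typescript
interface RawTick {
  stock: string;
  price: number;
  timestamp: number; // Unix epoch in milliseconds
}

interface SyntheticTick {
  stock: string;
  bid: number;
  ask: number;
  timestamp: number;
}

// Example validation rules (assumed): non-empty symbol, positive finite price,
// and a timestamp newer than the last accepted tick for that stock.
function isValidTick(tick: RawTick, lastTimestamp: number): boolean {
  return (
    tick.stock.length > 0 &&
    isFinite(tick.price) &&
    tick.price > 0 &&
    tick.timestamp > lastTimestamp
  );
}

// Widen the raw price by a safety spread given in basis points (1 bps = 0.01%),
// split evenly between the bid and the ask side.
function applySpread(tick: RawTick, spreadBps: number): SyntheticTick {
  const halfSpread = (tick.price * spreadBps) / 10000 / 2;
  return {
    stock: tick.stock,
    bid: tick.price - halfSpread,
    ask: tick.price + halfSpread,
    timestamp: tick.timestamp,
  };
}

// Name of the Redis channel a tick for this stock is published on.
function redisChannelFor(stock: string): string {
  return `price_feed:${stock}`;
}

// Sample tick with invented values.
const sampleTick: RawTick = { stock: "AAPL", price: 100, timestamp: 2000 };
const accepted = isValidTick(sampleTick, 1000);
const quote = applySpread(sampleTick, 20); // 20 bps spread around 100
const channel = redisChannelFor(sampleTick.stock);
```

Keeping these steps as pure functions makes them easy to test in isolation before the result is handed to the Redis and RabbitMQ clients.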
API Server
This service provides the event stream API (SSE) and other REST APIs to clients. Each instance of this service subscribes to the Redis price_feed:${stock} channel to get price updates in real time. Whenever there is a new price update, each api-server instance is notified about the new price via events. On receiving this event, the api-server pushes the price update event onto the SSE stream so that clients receive the real-time price updates.
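The fan-out inside one api-server instance can be sketched as an in-memory registry that maps each stock to the clients currently streaming it. This is a simplified illustration, not the article's implementation: the class and method names are hypothetical, a client is reduced to a function that writes text to its open response, and the Redis subscription is represented by a direct call to publish.

```typescript
// A connected SSE client, reduced to a function that writes a chunk of text
// to that client's open HTTP response.
type SendFn = (chunk: string) => void;

class PriceFeedRegistry {
  private clientsByStock: { [stock: string]: SendFn[] } = {};

  // Register a client for a stock; the returned function removes it again
  // (for example when the client's connection closes).
  subscribe(stock: string, send: SendFn): () => void {
    const clients = this.clientsByStock[stock] || [];
    clients.push(send);
    this.clientsByStock[stock] = clients;
    return () => {
      this.clientsByStock[stock] = (this.clientsByStock[stock] || []).filter((s) => s !== send);
    };
  }

  // Called when a message arrives on the stock's Redis channel.
  // Writes one SSE "data:" event to every client and returns how many were notified.
  publish(stock: string, payload: string): number {
    const clients = this.clientsByStock[stock] || [];
    clients.forEach((send) => send("data: " + payload + "\n\n"));
    return clients.length;
  }
}

// Usage with a fake client that just records what it receives.
const registry = new PriceFeedRegistry();
const receivedChunks: string[] = [];
const unsubscribeClient = registry.subscribe("AAPL", (chunk) => {
  receivedChunks.push(chunk);
});

const notifiedBefore = registry.publish("AAPL", '{"price":189.25}'); // 1 client notified
unsubscribeClient();
const notifiedAfter = registry.publish("AAPL", '{"price":189.3}');   // 0 clients notified
```

Because each instance keeps its own registry and its own Redis subscription, adding more api-server instances does not require any coordination between them.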
Stock Price Service
This service subscribes to the save-price-ticks message queue, consumes all the messages, and saves them in a time-series database such as Timescale or InfluxDB for analytics. A separate message queue is added in front of this service because we need persistence in case of a failure, to avoid losing price ticks for analytics.
Price Graph Service
This service subscribes to the update-candle-sticks message queue and updates the different candles in the cache as well as in the primary read database of your system. A separate message queue is added in front of this service so that, by using acknowledgements, messages are processed in the correct order.
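Why ordering matters here becomes clearer with a small sketch of candle aggregation. The types, the function name, the one-minute interval and the sample ticks are assumptions made for illustration, and the cache and database writes are omitted.

```typescript
interface CandleTick {
  price: number;
  timestamp: number; // Unix epoch in milliseconds
}

interface Candle {
  openTime: number; // start of the interval this candle covers
  open: number;
  high: number;
  low: number;
  close: number;
}

// Fold one tick into the candle series and return the new series.
// Ticks are expected in processing order: the close of a candle is simply
// the last tick processed for its interval.
function applyTick(candles: Candle[], tick: CandleTick, intervalMs: number): Candle[] {
  const openTime = Math.floor(tick.timestamp / intervalMs) * intervalMs;
  const last = candles.length > 0 ? candles[candles.length - 1] : undefined;
  if (last === undefined || last.openTime !== openTime) {
    // First tick of a new interval starts a new candle.
    return candles.concat([
      { openTime: openTime, open: tick.price, high: tick.price, low: tick.price, close: tick.price },
    ]);
  }
  const updated: Candle = {
    openTime: last.openTime,
    open: last.open,
    high: Math.max(last.high, tick.price),
    low: Math.min(last.low, tick.price),
    close: tick.price,
  };
  return candles.slice(0, -1).concat([updated]);
}

// Sample ticks with invented values: three in the first minute, one in the second.
const sampleTicks: CandleTick[] = [
  { price: 100, timestamp: 0 },
  { price: 105, timestamp: 20000 },
  { price: 98, timestamp: 40000 },
  { price: 101, timestamp: 61000 },
];

const minuteCandles = sampleTicks.reduce(
  (acc, tick) => applyTick(acc, tick, 60000),
  [] as Candle[],
);
```

With the ticks processed in order, the first candle closes at 98. If the second and third ticks were processed in the opposite order, the same candle would close at 105, which is why this service consumes from a queue with acknowledgements instead of a broadcast channel.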
Further Reading