Leveraging Web sockets for powering chat platforms

Image source: https://res.cloudinary.com/dn1j6dpd7/image/upload/v1615368957/help/LiveChat_Lite.png

HTTP Long Polling vs WebSocket for Chat-Based Platforms

Consider a client and a server. The client requests data from the server over HTTP and then waits for the response. The connection stays open until the server has a response to send back. Because the connection is held open for a long time, this technique is called HTTP long polling.

As soon as the server has a response, it sends it to the client and the connection is closed. The client then makes another request and waits again, and this cycle repeats for as long as the client wants updates.

HTTP Long-Polling

There is a timeout for each request. So, when the client gets a response or a timeout occurs, the connection gets closed, then the client makes a new request, and starts waiting for the next response.
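The request loop described above can be sketched in a few lines of Kotlin. This is only an illustration, not a real HTTP client: FakeLongPollServer, PollStats and runLongPolling are hypothetical names, and a null response stands in for a request that timed out with nothing to deliver.

```kotlin
// Hypothetical stand-in for a long-polling endpoint: each call to poll()
// returns the next pending message, or null when the request timed out.
class FakeLongPollServer(private val responses: List<String?>) {
    private var next = 0
    fun hasMore(): Boolean = next < responses.size
    fun poll(): String? = responses[next++]
}

data class PollStats(
    val requests: Int,
    val emptyResponses: Int,
    val messages: List<String>
)

// Long-polling client loop: every response or timeout ends the current
// request, and the client immediately issues a new one.
fun runLongPolling(server: FakeLongPollServer): PollStats {
    var requests = 0
    var empty = 0
    val messages = mutableListOf<String>()
    while (server.hasMore()) {
        requests++                 // one new HTTP request per iteration
        val body = server.poll()   // a real client would block here until data or timeout
        if (body == null) empty++ else messages += body
    }
    return PollStats(requests, empty, messages)
}
```

Counting the empty responses next to the delivered messages makes it easy to see how many requests carried no useful data.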

Long polling can be a bad choice for a large-scale chat system for the following reasons:

1- Resource intensive: The server often has no updates for the client, so many requests time out and return an empty response; only a few carry useful data. A large share of the network traffic is therefore spent on unnecessary calls.

2- Scalability: Each request opens a new connection that is closed once the response arrives, so 100 requests mean 100 connections established and torn down.

3- Decreased performance: As the number of open connections grows, the server's resources are strained, leading to slower response times and lower overall system performance. This can make it difficult to add new features or scale the system to accommodate more users.

A WebSocket is a persistent connection between a client and a server. It provides a bidirectional, full-duplex communication channel that is set up through an HTTP handshake and then runs over a single TCP connection. At its core, the WebSocket protocol facilitates message passing between a client and a server. A WebSocket endpoint is addressed with the ws:// scheme, or wss:// for TLS-encrypted connections.

First, the client opens a TCP connection to the server and performs the WebSocket handshake over it as an HTTP request. From there, WebSocket works differently from plain HTTP: the server answers with status code 101 (Switching Protocols), after which both sides speak the WebSocket protocol on the same connection.

Once established, the channel is kept open, offering a fast connection with low latency.

Because the handshake happens only once, at the beginning, the overhead of repeatedly setting up connections is avoided.

WebSocket connection

Once the connection is established, either side can send data independently of the other, and this continues until one of the parties closes the connection. This is what makes the communication bi-directional.

WebSocket is great for real-time and long-lived communications and low-latency interactions. HTTP is great for occasional data exchange and interactions initiated by the client.

WebSocket is a stateful protocol: the connection between client and server stays alive until either party terminates it. Once either side closes the connection, it is terminated at both ends.

How Is a WebSocket Connection Established?

The client sends a normal HTTP request that contains two special headers, Upgrade: websocket and Connection: Upgrade, along with any other required request data (such as the Sec-WebSocket-Key and Sec-WebSocket-Version headers).

The server then sends back an HTTP 101 Switching Protocols status code, which says: “We used to talk in HTTP, but we’ll use something else in the future.” Along with this HTTP 101 response, it also sends the Upgrade: websocket and Connection: Upgrade headers again.

After this, the handshake is complete and the WebSocket connection is set up.
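As a rough sketch of what a client checks when the 101 response arrives: RFC 6455 (not covered in detail in this article) also has the client send a random Sec-WebSocket-Key header and the server reply with a Sec-WebSocket-Accept header derived from it. The function names computeAcceptKey and isValidUpgradeResponse below are made up for illustration; libraries such as OkHttp perform this validation internally. Note that java.util.Base64 requires Android API level 26 or higher.

```kotlin
import java.security.MessageDigest
import java.util.Base64

// Fixed GUID defined by RFC 6455 for the opening handshake.
private const val WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

// Sec-WebSocket-Accept = Base64(SHA-1(Sec-WebSocket-Key + GUID)).
fun computeAcceptKey(secWebSocketKey: String): String {
    val digest = MessageDigest.getInstance("SHA-1")
        .digest((secWebSocketKey + WEBSOCKET_GUID).toByteArray(Charsets.US_ASCII))
    return Base64.getEncoder().encodeToString(digest)
}

// Hypothetical helper: checks the status code and the headers of the
// server's handshake response against the key the client sent.
fun isValidUpgradeResponse(
    statusCode: Int,
    headers: Map<String, String>,
    sentKey: String
): Boolean {
    // HTTP header names are case-insensitive, so normalize them first.
    val h = headers.mapKeys { it.key.lowercase() }
    return statusCode == 101 &&
        h["upgrade"].equals("websocket", ignoreCase = true) &&
        h["connection"].equals("Upgrade", ignoreCase = true) &&
        h["sec-websocket-accept"] == computeAcceptKey(sentKey)
}
```

With the sample key from RFC 6455, dGhlIHNhbXBsZSBub25jZQ==, computeAcceptKey returns s3pPLMBiTxaQ9kYGzzhZRbK+xOo=.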

Why Is WebSocket Suitable for Chat-Based Platforms?

1. Bi-directional communication:

  • WebSocket allows for bi-directional, full-duplex communication between the server and the client.
  • Both parties can send and receive messages simultaneously, making it ideal for real-time communication in chat applications.

2. Persistent connection:

  • WebSocket offers a persistent connection between the server and the client.
  • This eliminates the need for the client to repeatedly send requests to the server to check for new messages.
  • As a result, messages can be delivered instantly to the client as soon as they are sent, reducing network traffic, and improving performance.

3. Compatibility:

  • WebSocket provides a cross-platform solution, allowing developers to build applications that can be used across different devices and platforms.
  • It is supported by most modern web browsers and can also be used in mobile applications.

4. Scalability:

  • WebSocket can handle a large number of concurrent connections, making it suitable for chat applications with a large number of users.
  • Developers can use load balancers and other technologies to distribute WebSocket connections across multiple servers, further enhancing scalability.

5. Real-time user experience:

  • WebSocket's low-latency communication enables real-time interactions between users, which is ideal for chat-based platforms that require instantaneous messaging.
  • This provides a seamless and engaging user experience, which is essential for chat-based platforms.

6. Reliability:

  • Because WebSocket runs over TCP, messages on an open connection arrive in order and without corruption, which suits chat-based applications where users expect timely and accurate message delivery.

How to Create a WebSocket Client on Android

OkHttp is an open-source library developed by Square, and OkHttp3 is one option for building a WebSocket client on Android. It has a WebSocketListener class whose callbacks can be overridden to react to an opened connection, incoming messages, failures and the closing of the connection, while the WebSocket object itself is used to send messages and close the connection. More details on the class and its methods are available in the OkHttp documentation (reference 1 below).

import android.util.Log
import androidx.annotation.VisibleForTesting
import java.util.concurrent.TimeUnit
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.flow.MutableSharedFlow
import kotlinx.coroutines.launch
import okhttp3.OkHttpClient
import okhttp3.Response
import okhttp3.WebSocket
import okhttp3.WebSocketListener

// WebsocketConfig, RetryComponent, ClientConfig, Constants, getWebsocketRequest(),
// parseMessageReceivedFromWebsocket() and the response types are app-specific
// and defined elsewhere in the project.
internal class WebsocketClient(
    val websocketConfig: WebsocketConfig,
    private val websocketConnectionRetry: RetryComponent = RetryComponent()
) : WebSocketListener() {

    companion object {
        // Single shared socket instance for the whole app.
        @Volatile
        @VisibleForTesting
        var webSocket: WebSocket? = null
            private set

        const val NORMAL_CLOSURE_STATUS = 1000
        const val TAG = "WebsocketClient"
    }

    // Incoming chat messages are exposed to the rest of the app as a flow.
    var webSocketMessageFlow = MutableSharedFlow<WebsocketReceiveMessageResponse>()
        private set

    fun startWebsocketConnection() {
        // Create the socket only if none exists yet (double-checked locking).
        webSocket = webSocket ?: synchronized(this) {
            webSocket ?: getOkHttpClient()
                .newBuilder()
                .pingInterval(websocketConfig.pingPongInterval.toLong(), TimeUnit.SECONDS)
                .build()
                .newWebSocket(getWebsocketRequest(websocketConfig.url), this)
        }

        Log.d(
            TAG,
            "Initiate Start connection, url= ${websocketConfig.url} ping pong=${websocketConfig.pingPongInterval}"
        )
    }

    fun getOkHttpClient(): OkHttpClient {
        return OkHttpClient.Builder()
            .addInterceptor(ClientConfig.getInterceptor())
            .connectTimeout(Constants.LONG_POLLING_DEFAULT_TIMEOUT, TimeUnit.SECONDS)
            .writeTimeout(Constants.LONG_POLLING_DEFAULT_TIMEOUT, TimeUnit.SECONDS)
            .readTimeout(Constants.LONG_POLLING_DEFAULT_TIMEOUT, TimeUnit.SECONDS)
            .build()
    }

    fun closeWebsocketConnection() {
        // Graceful shutdown with the standard "normal closure" code.
        webSocket?.close(
            NORMAL_CLOSURE_STATUS,
            "Websocket connection closed normally"
        )
        webSocket = null

        Log.d(TAG, "Initiate Close connection, url= ${websocketConfig.url}")
    }

    override fun onOpen(webSocket: WebSocket, response: Response) {
        websocketConnectionRetry.resetRetryConfiguration()
        Log.d(TAG, "Connection opened, url=${websocketConfig.url}")
    }

    override fun onMessage(webSocket: WebSocket, text: String) {
        Log.d(TAG, "Received message, text=$text")
        // Parse and dispatch off the OkHttp reader thread.
        CoroutineScope(Dispatchers.IO).launch {
            processWebsocketResponse(parseMessageReceivedFromWebsocket(text))
        }
    }

    private suspend fun processWebsocketResponse(response: WebsocketGenericResponse?) {
        response?.let {
            when (it) {
                is WebsocketReceiveMessageResponse -> {
                    webSocketMessageFlow.emit(it)
                }
                is WebsocketErrorMessageResponse -> {
                    // handle error
                }
            }
        }
    }

    override fun onClosed(webSocket: WebSocket, code: Int, reason: String) {
        Log.d(TAG, "Connection closed, url=${websocketConfig.url}")
    }

    override fun onFailure(webSocket: WebSocket, throwable: Throwable, response: Response?) {
        Log.d(TAG, "Retry initiated for url=${websocketConfig.url}", throwable)
        // Drop the failed socket; otherwise startWebsocketConnection() would see a
        // non-null instance and skip creating a new connection on retry.
        Companion.webSocket = null
        websocketConnectionRetry.startRetry(::startWebsocketConnection, websocketConfig, Dispatchers.IO)
    }

    fun sendMessage(message: String) {
        Log.d(TAG, "Send message, url=${websocketConfig.url} message=$message")
        webSocket?.send(message)
    }
}
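The client above delegates reconnection to a RetryComponent that is not shown in this article. As one possible way to pace reconnect attempts, here is a minimal exponential backoff sketch. BackoffPolicy, its parameters and its default values are assumptions made for illustration and are not the actual RetryComponent.

```kotlin
import kotlin.math.min

// Hypothetical backoff policy; names and defaults are illustrative assumptions.
class BackoffPolicy(
    private val initialDelayMs: Long = 1_000,
    private val maxDelayMs: Long = 30_000,
    private val maxAttempts: Int = 5
) {
    private var attempt = 0

    // Returns the delay to wait before the next reconnect attempt,
    // or null once all attempts have been used up.
    fun nextDelayMs(): Long? {
        if (attempt >= maxAttempts) return null
        // Double the delay on every attempt; cap the shift to avoid overflow.
        val shift = min(attempt, 16)
        val delay = min(initialDelayMs shl shift, maxDelayMs)
        attempt++
        return delay
    }

    // Meant to be called when a connection opens successfully, so that a
    // later failure starts again from the shortest delay.
    fun reset() {
        attempt = 0
    }
}
```

In a design like this, onFailure would ask for nextDelayMs() and schedule a reconnect after that delay, while onOpen would call reset(), mirroring the resetRetryConfiguration() call in the client above.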

OkHttp WebSocket Lifecycle

In normal operation, each web socket progresses through the first four states below; the last one, Canceled, is reached only when the connection fails:

  • Connecting: this is the initial state of each web socket. Messages may be enqueued, but they will not be transmitted until the web socket is open.
  • Open: this specifies the web socket has been accepted by the remote peer and is fully operational. Messages in either direction are enqueued for immediate transmission.
  • Closing: this state specifies that one of the peers on the web socket has initiated a graceful shutdown. The web socket will continue to transmit already-enqueued messages but will refuse to enqueue new ones.
  • Closed: this state specifies the web socket has transmitted all of its messages and has received all messages from the peer.
  • Canceled: this state specifies the web socket connection has failed. Messages that were successfully enqueued by either peer may not have been transmitted to the other.

Note: The WebSocket listener's onMessage(webSocket: WebSocket, text: String) callback receives every kind of message, whether a successful response or an error response, so the client must be able to parse and handle both.
Also, to send a JSON object over a WebSocket, serialize it to a string first and pass that string to the WebSocket's send method.

Web sockets may fail due to HTTP upgrade problems, connectivity problems, or if either peer chooses to short-circuit the graceful shutdown process. OkHttp provides two methods for ending a connection:

  • .close — Attempts to initiate a graceful shutdown of this web socket. Any already-enqueued messages will be transmitted before the close message is sent but subsequent calls to send will return false and their messages will not be enqueued.
  • .cancel — Immediately and violently release resources held by this web socket, discarding any enqueued messages. This does nothing if the web socket has already been closed or canceled.
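The lifecycle states and the difference between close and cancel can be summarized in a small model. This is a simplified sketch for reasoning about the rules described above, not OkHttp's implementation; SocketState and SocketLifecycleModel are hypothetical names.

```kotlin
enum class SocketState { CONNECTING, OPEN, CLOSING, CLOSED, CANCELED }

// Simplified model of the lifecycle rules; it only tracks state and a queue.
class SocketLifecycleModel {
    var state = SocketState.CONNECTING
        private set
    private val queue = ArrayDeque<String>()

    private fun acceptsMessages() =
        state == SocketState.CONNECTING || state == SocketState.OPEN

    // The remote peer accepted the connection.
    fun onOpen() {
        if (state == SocketState.CONNECTING) state = SocketState.OPEN
    }

    // Messages are accepted while connecting or open and refused afterwards.
    fun send(message: String): Boolean {
        if (!acceptsMessages()) return false
        queue.addLast(message)
        return true
    }

    // Graceful shutdown: already-enqueued messages stay queued for transmission.
    fun close(): Boolean {
        if (!acceptsMessages()) return false
        state = SocketState.CLOSING
        return true
    }

    // The graceful shutdown finished: everything queued has been transmitted.
    fun onClosed() {
        if (state == SocketState.CLOSING) {
            queue.clear()
            state = SocketState.CLOSED
        }
    }

    // Abrupt shutdown: discards queued messages; no-op once closed or canceled.
    fun cancel() {
        if (state == SocketState.CLOSED || state == SocketState.CANCELED) return
        queue.clear()
        state = SocketState.CANCELED
    }

    fun pendingMessages(): List<String> = queue.toList()
}
```

The practical takeaway is that close lets queued messages go out before the connection ends, while cancel discards them, so close is usually the better fit when a chat screen is dismissed normally.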

Conclusion

Overall, WebSocket provides a reliable, efficient, and easy-to-use protocol for building chat-based platforms.

WebSocket suits real-time updates and continuous streams of data transmitted over the network.
If the requirement is to fetch old data, or to get data only once for processing, plain HTTP should be used. Data that is needed infrequently or fetched only once can be queried with a simple HTTP request, so WebSocket is not needed in that scenario.

References

1- https://square.github.io/okhttp/4.x/okhttp/okhttp3/-web-socket-listener/
2- https://medium.com/walmartglobaltech/exploring-websocket-and-its-brief-implementation-for-android-cc461597e1dc

Leveraging Web sockets for powering chat platforms was originally published in Walmart Global Tech Blog on Medium.

Article Link: Leveraging Web sockets for powering chat platforms | by Monika Kumar Jethani | Walmart Global Tech Blog | Dec, 2023 | Medium