Why not use socket.write callback in Node.js?

This article is published under the CC-BY-4.0 license. Please follow that license when republishing.
Article URL: https://fenying.net/en/post/2025/03/16/why-not-use-socket-write-callback-in-nodejs/

This article is about a mistake we made when using the callback parameter of the socket.write method in Node.js.

```ts
declare class Socket extends EventEmitter {

    write(buffer: Uint8Array | string, cb?: (err?: Error) => void): boolean;
}
```

In Node.js, the socket.write method is used to write data through the Socket. Here are the two things you need to know about it:

  • When writing data to the Socket, if there is enough space in the kernel buffer, the socket.write method puts the data into the kernel buffer and returns true. Otherwise it returns false, and the data is written to the user space buffer instead.

    After the data is written to the user space buffer, Node.js will wait for the kernel buffer to be available, and then write the data to the kernel buffer. After all the data is written to the kernel buffer and the kernel buffer becomes available again, Node.js will emit a drain event, indicating that it is ready to write more data.

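  • The last parameter of the socket.write method is a callback function. As we understood it, it is called after the data is received and acknowledged by the other end; the Node.js documentation itself only says that it runs once the data has finally been written out, which may not be immediately.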
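The return value and the drain event from the first point are usually combined to apply backpressure. Here is a minimal sketch, assuming a hypothetical `DrainableLike` interface that lists only the two members used (a real `net.Socket` provides both) and a hypothetical helper named `writeChunks`:

```ts
// Hypothetical minimal shape, for illustration only; net.Socket provides both members.
interface DrainableLike {
    write(data: Uint8Array): boolean;
    once(event: 'drain', listener: () => void): unknown;
}

// Writes the chunks in order, pausing whenever write() returns false
// and resuming on the next 'drain' event.
function writeChunks(out: DrainableLike, chunks: readonly Uint8Array[], done: () => void): void {

    let index = 0;

    const pump = (): void => {

        while (index < chunks.length) {

            const chunk = chunks[index] as Uint8Array;
            index += 1;

            if (!out.write(chunk)) {

                // This chunk was still accepted (buffered in user space),
                // so only the chunks after it wait for 'drain'.
                out.once('drain', pump);
                return;
            }
        }

        done();
    };

    pump();
}
```

Note that a false return value does not mean the chunk was rejected. It only signals that the caller should stop writing until drain fires.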

Well, it seems fine. So what’s the problem we encountered? Let’s see the background story first.

We have a client for an application layer protocol. Like HTTP, the protocol uses a request-response communication model. The requests don't carry a unique identifier, so the only way to match responses to requests is by the order in which the requests were sent.
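Matching by send order boils down to a first-in, first-out queue. The sketch below is only an illustration of that idea; `PendingQueue` and its method names are hypothetical and are not taken from our client:

```ts
// Hypothetical helper: matches responses to requests purely by send order.
class PendingQueue<TResponse> {

    private readonly waiters: Array<(response: TResponse) => void> = [];

    // Called once per request, in the order the requests are sent.
    public enqueue(onResponse: (response: TResponse) => void): void {

        this.waiters.push(onResponse);
    }

    // Called once per response, in arrival order.
    // Returns false when no request was waiting for this response.
    public dispatch(response: TResponse): boolean {

        const next = this.waiters.shift();

        if (next === undefined) {

            return false;
        }

        next(response);
        return true;
    }
}
```

With this scheme, correctness depends entirely on every request being in the queue before its response can arrive.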

The problem is this: the client sends a request r1, and the server does not respond in time. If the client then sends another request r2, the callback of r1 may be called with the response of r2. Yes, the responses get mismatched. The problem is also not always reproducible; it only shows up in some cases.

After investigating, we found that the problem was caused by how the callback function of the socket.write method was used. Here is the simplified pseudo-code:

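```ts
import type { Socket } from 'node:net';

// Minimal shape of a request for this example.
interface Request {
    toBuffer(): Buffer;
}

class Client {

    private _requests: Request[] = [];

    public constructor(private readonly _socket: Socket) {}

    private _queueRequest(request: Request): void {

        this._requests.push(request);
    }

    public send(request: Request): void {

        this._socket.write(request.toBuffer(), (err) => {

            if (err) {

                return;
            }

            // Problematic: the request is queued only when the callback fires,
            // which can be after its response has already arrived.
            this._queueRequest(request);
        });
    }
}
```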

The send method writes the data to the socket, and then puts the request into the _requests queue only when the callback function is called without an error. I can't remember why we did this, but it was probably for one of these two reasons:

  • We misread the document and thought the callback function would be called after the data was written to the kernel buffer.
  • We wanted to make sure the request was sent successfully before putting it into the queue.

I lean toward the second reason, but either way, the code doesn't look wrong at first glance.

So what’s the problem?

Let's review the TCP protocol. TCP is a stream-oriented, full-duplex protocol. It ensures that data is delivered in order, without loss, and without duplication. All data is sent in segments, and each segment carries a sequence number. The receiver acknowledges the data it receives, and the sender retransmits any segments that are not acknowledged.

When the sender sends a segment, the receiver does not necessarily acknowledge it immediately. Instead, it may wait a short while to see whether more segments arrive, and then acknowledge everything received so far with a single acknowledgment. This is called delayed acknowledgment.

However, we missed one point: the receiver delivers the data to the application layer as soon as possible after receiving it, not after sending the acknowledgment. This means the application on the other side can process a request and send its response before the acknowledgment for that request is sent.

Now we have the answer.

The callback function of the socket.write method is called only after one of the segments is acknowledged, but the response to the request can be sent back before that acknowledgment. So the callback function is not a good place to put the request into the queue.

Besides, when we checked the source code of the response-processing logic, we found that a response is silently dropped if no request in the queue can be matched, instead of the connection being shut down and an exception being thrown. So the response to r1 arrives while the queue is still empty and is dropped. Then r1 is finally put into the queue, followed by r2, and when the response to r2 arrives it is matched with the first request in the queue, which is r1.
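The sequence of events can be replayed without any real socket. The sketch below is a pure simulation under stated assumptions: `replayIncident` and the names inside it are hypothetical, and the order of the calls at the bottom is simply the order described above, with the response to r1 arriving before the write callback of r1 has run.

```ts
// Replays the event order from the incident, without any real socket.
// queueInCallback = true models the buggy client, false models the fixed one.
function replayIncident(queueInCallback: boolean): string[] {

    const queue: string[] = [];
    const log: string[] = [];

    // The moment send() is called.
    const send = (name: string): void => {

        if (!queueInCallback) {

            queue.push(name);
        }
    };

    // The moment the socket.write callback of that request runs.
    const writeCallback = (name: string): void => {

        if (queueInCallback) {

            queue.push(name);
        }
    };

    // The moment a response arrives; unmatched responses are silently dropped.
    const onResponse = (response: string): void => {

        const request = queue.shift();

        log.push(request === undefined
            ? `dropped ${response}`
            : `${request} <- ${response}`);
    };

    send('r1');
    onResponse('response-1');   // arrives before the write callback of r1 runs
    writeCallback('r1');
    send('r2');
    writeCallback('r2');
    onResponse('response-2');

    return log;
}
```

Under these assumptions, `replayIncident(true)` yields `['dropped response-1', 'r1 <- response-2']`, which is exactly the mismatch we saw, while `replayIncident(false)` yields `['r1 <- response-1', 'r2 <- response-2']`.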

In the end, the solution is simple: put the request into the queue immediately after writing the data to the socket (buffer), in the same synchronous step, rather than in the callback. This ensures the requests and responses are matched correctly.
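A minimal sketch of the fixed client could look like the following. It is not our production code: `SocketLike` and `PendingRequest` are hypothetical shapes introduced for illustration (a `net.Socket` is expected to fit `SocketLike` structurally), response framing is left out, and destroying the connection on an unmatched response is one possible fail-fast choice rather than the only one.

```ts
// Hypothetical minimal socket shape, so the sketch needs no imports.
interface SocketLike {
    write(data: Uint8Array, cb?: (err?: Error | null) => void): boolean;
    destroy(error?: Error): unknown;
}

// Hypothetical request shape for this sketch.
interface PendingRequest {
    payload: Uint8Array;
    onResponse: (response: Uint8Array) => void;
}

class Client {

    private readonly socket: SocketLike;

    private readonly pending: PendingRequest[] = [];

    public constructor(socket: SocketLike) {

        this.socket = socket;
    }

    public send(request: PendingRequest): void {

        // Write and queue in the same synchronous step, so that no response
        // can be processed while the request is missing from the queue.
        this.socket.write(request.payload);
        this.pending.push(request);
    }

    // Expected to be called once per complete, decoded response.
    public handleResponse(response: Uint8Array): void {

        const request = this.pending.shift();

        if (request === undefined) {

            // Fail fast instead of silently dropping the response.
            this.socket.destroy(new Error('Received a response without a pending request'));
            return;
        }

        request.onResponse(response);
    }
}
```

Failing fast on an unmatched response would have turned our intermittent mismatch into an immediate, visible error.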

🤷‍♂️ So, WATCH OUT when using the callback parameter of socket.write. In most cases it is unnecessary, because the error event already covers error handling. The exception is when you really need to know that the other side has received the segments.
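As an illustration of relying on the error event instead of the write callback, here is a small sketch. `ErrorSourceLike`, `Failable` and `failPendingOnError` are hypothetical names, and failing every queued request when the socket errors is an assumed policy, not something our client is known to do:

```ts
// Hypothetical minimal shape; net.Socket has a compatible once('error', ...) overload.
interface ErrorSourceLike {
    once(event: 'error', listener: (err: Error) => void): unknown;
}

// Hypothetical shape of a queued request that can be failed.
interface Failable {
    onError: (err: Error) => void;
}

// On a socket error, fail every request still waiting for a response.
function failPendingOnError(socket: ErrorSourceLike, pending: Failable[]): void {

    socket.once('error', (err) => {

        // splice(0) empties the queue and returns the removed requests.
        for (const request of pending.splice(0)) {

            request.onError(err);
        }
    });
}
```

This keeps error handling in one place, so send never needs a per-write callback.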
