Description
- Node.js Version: 9.3.0 (and most probably earlier ones)
- OS: Linux 64 bits
- Scope (install, code, runtime, meta, other?): runtime
- Module (and version) (if relevant): stream [, http]
From nodejs docs (https://nodejs.org/api/stream.html#stream_event_readable):
Note: In general, the readable.pipe() and 'data' event mechanisms are easier to understand than the 'readable' event. However, handling 'readable' might result in increased throughput.
I decided to try this increased throughput. However, I ran into a compatibility issue between 'readable' and 'data', as well as inconsistent behaviour of 'readable' on its own, in the case of http responses.
Issue
From https://nodejs.org/api/stream.html#stream_event_readable:
The 'readable' event is emitted when there is data available to be read from the stream. In some cases, attaching a listener for the 'readable' event will cause some amount of data to be read into an internal buffer.
The 'readable' event will also be emitted once the end of the stream data has been reached but before the 'end' event is emitted.
Effectively, the 'readable' event indicates that the stream has new information: either new data is available or the end of the stream has been reached. In the former case, stream.read() will return the available data. In the latter case, stream.read() will return null.
However, it seems that if 'readable' is used in conjunction with 'data', then it emits false-positive end-of-stream events (i.e. where stream.read() returns null), along with no useful data (all useful data was consumed by the 'data' handler).
If it's used on its own, without a 'data' handler, it fails to detect the end-of-stream event when the stream is short and not delayed (i.e. when it's consumed in one go).
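For reference, here is a minimal sketch of how the quoted documentation suggests consuming a stream with 'readable' alone: call read() in a loop until it returns null, and treat 'end' as the completion signal. The helper name collectWithReadable is hypothetical, and a PassThrough stream stands in for the http response, so this only illustrates the pattern and is not a reproduction of the issue.

```javascript
const { PassThrough } = require('stream');

// Hypothetical helper (not from the report): drain a stream using only
// the 'readable' event and resolve with the collected text on 'end'.
function collectWithReadable(stream) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    stream.on('readable', () => {
      let chunk;
      // Keep reading until read() returns null, i.e. the internal buffer is empty.
      while ((chunk = stream.read()) !== null) {
        chunks.push(chunk);
      }
    });
    // Completion is signalled by 'end', not by a null from read().
    stream.on('end', () => resolve(Buffer.concat(chunks).toString()));
    stream.on('error', reject);
  });
}

// Assumption: a PassThrough stream stands in for the http response.
const source = new PassThrough();
const collected = collectWithReadable(source);
source.write('Hello World!');
source.end('Hello again now!');
collected.then((text) => console.log('collected:', text));
```

With this shape, the handler never has to decide whether a null from read() means "end of stream"; it only relies on 'end' for that.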
How to reproduce
Code
Here is a simple node snippet that creates an http server with a handler for 3 urls covering the different cases, and then sends the corresponding requests in a chain. In order not to introduce other unrelated errors (like ParseError on TCP.onread or Bad Request responses), all content lengths were carefully provided for each request/response. The code is lengthy but simple (with hooks for envvars for easy changes).
The http server serves 3 types of responses: immediate short, immediate appended and delayed appended.
const PORT = 8080;
const http = require('http');
const HelloWorld = "Hello World!";
const HelloAgainLater = "Hello again later!";
const HelloAgainNow = "Hello again now!";
const HelloServer = "Hello, server!";
// these control whether you attach a data/readable handler to the response
const DataHandler = !process.env.NO_DATA;
const ReadableHandler = !process.env.NO_READABLE;
var server = http.createServer((req, res) => {
  console.log("Received", req.url);
  if(req.url == "/delay"){
    res.writeHead(200, {"Content-Length": "" + (HelloWorld.length + HelloAgainLater.length)});
    res.write(HelloWorld);
    setTimeout(() => {
      res.end(HelloAgainLater);
    }, 1000);
  } else if(req.url == "/more"){
    res.writeHead(200, {"Content-Length": "" + (HelloWorld.length + HelloAgainNow.length)});
    res.write(HelloWorld);
    res.end(HelloAgainNow);
  } else {
    res.writeHead(200, {"Content-Length": "" + (HelloWorld.length)});
    res.write(HelloWorld);
    res.end();
  }
}).listen(PORT);
var request = function(path){
  var opts = {
    hostname: "localhost",
    port: PORT,
    path: path,
    headers: {
      "Content-Length": "" + HelloServer.length,
    }
  };
  return new Promise((resolve) => {
    console.log("Sending ", path);
    var req = http.request(opts, (response) => {
      response.on("error", (err) => {
        console.log("error", err);
      });
      if(ReadableHandler){
        response.on("readable", () => {
          var data = response.read();
          if(data){
            data = data.toString();
          }
          console.log("readable", data);
        });
      }
      if(DataHandler){
        response.on("data", (data) => {
          if(data){
            data = data.toString();
          }
          console.log("data", data);
        });
      }
      response.on("end", () => {
        console.log("end\n");
        resolve();
      });
    });
    req.write(HelloServer);
    req.end();
  });
};
var r = request("/").then(() => {
  return request("/more");
}).then(() => {
  return request("/delay");
}).then(() => {
  server.close();
});
How to run
- For both 'data' and 'readable' events:
$ cat test.js | docker run -i node:9.3.0
- For only the 'data' event:
$ cat test.js | docker run -e NO_READABLE=true -i node:9.3.0
- For only the 'readable' event:
$ cat test.js | docker run -e NO_DATA=true -i node:9.3.0
Outputs
Case 1 (both handlers)
Sending /
Received /
data Hello World!
readable null
end
Sending /more
Received /more
data Hello World!Hello again now!
readable null
end
Sending /delay
Received /delay
data Hello World!
readable null
data Hello again later!
readable null
end
Notice the first readable null under Sending /delay: it tells us the stream is finished, yet it is not.
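To illustrate why a null from read() is ambiguous for a delayed stream, here is a small sketch that logs every drained-buffer moment separately from the real 'end' event. It assumes a PassThrough stream in place of the http response and a 50 ms delay instead of the 1000 ms used above; the names delayed, events and finished are hypothetical, and the exact number of 'empty' entries may differ between Node.js versions.

```javascript
const { PassThrough } = require('stream');

const events = [];
// Assumption: a PassThrough stream stands in for the delayed http response.
const delayed = new PassThrough();

const finished = new Promise((resolve) => {
  delayed.on('readable', () => {
    let chunk;
    while ((chunk = delayed.read()) !== null) {
      events.push('data:' + chunk.toString());
    }
    // The loop exits on null: the buffer is drained, which may or may not
    // coincide with the end of the stream.
    events.push('empty');
  });
  delayed.on('end', () => {
    // Only this event says the stream is really finished.
    events.push('end');
    resolve(events);
  });
});

delayed.write('Hello World!');
// The second chunk arrives later, as in the /delay case.
setTimeout(() => delayed.end('Hello again later!'), 50);
finished.then((log) => console.log(log.join('\n')));
```

An 'empty' entry is recorded before the second chunk arrives, so a handler that treats null as end-of-stream would stop too early; waiting for 'end' avoids that.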
Case 2 (just 'data' event)
Received /
data Hello World!
end
Sending /more
Received /more
data Hello World!Hello again now!
end
Sending /delay
Received /delay
data Hello World!
data Hello again later!
end
No problems with 'data', as expected.
Case 3 (only 'readable' event)
Sending /
Received /
readable Hello World!
end
Sending /more
Received /more
readable Hello World!Hello again now!
end
Sending /delay
Received /delay
readable Hello World!
readable Hello again later!
readable null
end
Notice the absence of any readable lines with null under Sending / and Sending /more.