At work I was trying to debug a fatal error, which looked something like this:
You have triggered an unhandledRejection, you may have forgotten to catch a Promise rejection:
TypeError: Cannot read properties of null (reading 'eventId')
at /var/lib/node/batch-processor/worker/s3_event_handler.js:42:27
at /var/lib/node/common-services/service/common/entity_service.js:85:16
at /var/lib/node/common/node-packages/rista-mongodb/node_modules/mongodb/lib/utils.js:348:28
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
And here is the code that was causing it:
// called with s3Key "jobs/2024-01-01/<object-id>/<reportname>"
const s3EventHandler = function (message, callback) {
  const { s3Key } = message;
  try {
    const regex = new RegExp(`jobs/([0-9,-]{10})/([\\w]{24})/(.*)`);
    const [, day, jobId, filename] = regex.exec(s3Key);
    // find job details from the jobId
    entityService.getById(
      {
        id: jobId,
        entity: job,
        schema: events
      },
      (err, job) => {
        if (err) {
          return callback(err);
        }
        const message = {
          jobId,
          batchId: job.batchId,
          context: {
            links: [{ day, folder: jobId, filename, s3Key }]
          }
        };
        worker(message, callback);
      }
    );
  } catch (e) {
    callback(null, { success: true });
  }
};
This worker processes an SQS message, which could be produced by one of our applications or by another AWS service.
It extracts the jobId from the S3 key and fetches the relevant job details from the DB.
The `worker` internally handles the actual job.
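To make the parsing step concrete, here is a minimal sketch of what the regex does with a key. The `parseS3Key` helper and the sample keys are made up for illustration; only the regex comes from the handler above. Note that `regex.exec` returns null when nothing matches, so destructuring its result directly throws synchronously, which is the case the surrounding try/catch does cover.

```javascript
// Same pattern as in the handler: jobs/<10-char day>/<24-char id>/<filename>
const regex = new RegExp(`jobs/([0-9,-]{10})/([\\w]{24})/(.*)`);

// Hypothetical helper: returns the parsed parts, or null if the key does not match.
function parseS3Key(s3Key) {
  const match = regex.exec(s3Key);
  if (match === null) {
    return null;
  }
  const [, day, jobId, filename] = match;
  return { day, jobId, filename };
}

// Made-up keys, for illustration only.
const parsed = parseS3Key('jobs/2024-01-01/65a1b2c3d4e5f6a7b8c9d0e1/report.csv');
console.log(parsed);
console.log(parseS3Key('unrelated/key.txt'));
```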
If you pause here and think, you might have several questions:
Who is populating the jobId in the s3Key?
Why are we fetching job details from the DB if the message can also be triggered by other AWS services?
Do we always have the job details with us?
Answering these questions requires a lot of context, but for brevity let's assume that jobId is a MongoDB ObjectId, represented as a 24-character hex string. We generate this id, push a message to the queue, and populate the DB. At a later point the job is processed, we get that event, and we update the DB.
The problem was that sometimes the job was pushed to the queue but was not present in the DB. There is a silly error in the code:
const message = {
  jobId,
  batchId: job.batchId, // job can be null (or undefined) here when the record is missing
  context: {
    links: [{ day, folder: jobId, filename, s3Key }]
  }
};
This causes a TypeError when the job is not present in the DB. Of course, a simple defensive check would fix this, but I am glad I made this error. There are a couple of things I learned while uncovering the bug:
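For reference, the defensive check could look roughly like the sketch below. This is not the fix that was actually shipped; `buildWorkerMessage` and `onJobLoaded` are hypothetical names, and reporting a missing job as an error through the callback is an assumption about the desired behavior.

```javascript
// Hypothetical helper: returns the worker message, or null when the job
// record is missing (the == null check covers both null and undefined).
function buildWorkerMessage(job, { day, jobId, filename, s3Key }) {
  if (job == null) {
    return null;
  }
  return {
    jobId,
    batchId: job.batchId,
    context: { links: [{ day, folder: jobId, filename, s3Key }] }
  };
}

// Sketch of the getById callback with the guard in place. `worker` is passed
// in as a parameter so the example stays self-contained.
function onJobLoaded(err, job, parts, worker, callback) {
  if (err) {
    return callback(err);
  }
  const message = buildWorkerMessage(job, parts);
  if (message === null) {
    // Assumption: a missing job is reported as an error instead of throwing.
    return callback(new Error(`job ${parts.jobId} not found`));
  }
  return worker(message, callback);
}
```

With this guard, a missing job reaches the caller as an ordinary callback error rather than as an exception thrown inside the driver's promise chain.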
Process crash
To replicate this issue locally, I manually pushed a message to the queue with a non-existent jobId. There was a TypeError and the process crashed, but what was weird was that the process was not crashing in production. That is when I remembered that we have pm2 running in production.
But pm2 is supposed to restart a crashed process, not prevent the crash in the first place.
unhandledRejection
The 'unhandledRejection' event is emitted whenever a Promise is rejected and no error handler is attached to the promise within a turn of the event loop.
PM2 catches unhandled rejections on the process and does not send an exit signal, which is why the process was not crashing.
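The event itself is easy to observe in plain Node.js, without PM2. The sketch below is only an illustration: the `seen` array and the error messages are made up. It registers a listener and then creates one rejected promise with no handler and one with a handler.

```javascript
// Collect the messages of every rejection that nobody handled.
const seen = [];

process.on('unhandledRejection', (reason) => {
  seen.push(reason.message);
  console.error('unhandledRejection:', reason.message);
});

// Rejected with no handler attached: triggers the event.
Promise.reject(new Error('nobody caught me'));

// Rejected but handled with .catch: does not trigger the event.
Promise.reject(new Error('caught')).catch(() => {});
```

Registering a listener like this also keeps the process alive. In Node.js 15 and later, an unhandled rejection with no listener registered terminates the process by default, which matches the crash I saw locally.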
But how did a callback throw an unhandled promise rejection?
My code was callback style, the `getById` method was callback style, and none of the calling methods used an async function. So where was this error coming from?
The `getById` method internally interacts with MongoDB. I went down the rabbit hole of understanding the MongoDB Node.js driver we were using. That version was primarily focused on the async/await paradigm but still supported callbacks. To support both, it used patterns such as promisifying methods, wrapping synchronous functions in a promise, and so on. One such utility is this one from node_modules/mongodb/lib/utils.js:
function maybeCallback(promiseFn, callback) {
  const PromiseConstructor = promise_provider_1.PromiseProvider.get();
  const promise = promiseFn();
  if (callback == null) {
    if (PromiseConstructor == null) {
      return promise;
    }
    else {
      return new PromiseConstructor((resolve, reject) => {
        promise.then(resolve, reject);
      });
    }
  }
  // ***** THE CULPRIT ******
  promise.then(result => callback(undefined, result), error => callback(error));
  return;
}
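The error handler passed as the second parameter to `then` is the onRejected handler, which is called when `promise` rejects. But what if the callback itself throws? The callback runs inside the onFulfilled handler, and an error thrown there is not routed to the sibling onRejected handler. Instead it rejects the new promise returned by `then`, which has no handler attached, so it surfaces as an unhandled rejection.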
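The behavior can be reproduced in a few lines. The sketch below is a simplified stand-in for the callback branch of `maybeCallback`, not the driver's actual code path; `callbackify` and the `events` array are hypothetical names. The promise resolves with null (as a lookup for a missing document would), and the callback then dereferences it.

```javascript
// Record what happens so the outcome can be inspected afterwards.
const events = [];

process.on('unhandledRejection', (reason) => {
  events.push({ kind: 'unhandled', name: reason.name });
});

// Simplified stand-in for the callback branch of maybeCallback.
function callbackify(promiseFn, callback) {
  promiseFn().then(
    (result) => callback(undefined, result), // a throw in here is NOT seen by the next line
    (error) => callback(error)               // only runs if promiseFn() itself rejects
  );
}

callbackify(
  () => Promise.resolve(null), // "job not found" resolves with null
  (err, job) => {
    if (err) {
      events.push({ kind: 'callback-error', name: err.name });
      return;
    }
    // Throws a TypeError because job is null.
    events.push({ kind: 'ok', batchId: job.batchId });
  }
);
```

The callback is invoked exactly once, with no error, and the TypeError it throws ends up as the only recorded event, an unhandled rejection. The try/catch in `s3EventHandler` cannot help either, because by the time the callback runs the synchronous try block has already finished.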