@rafaelgss tech blog


Diagnostics Channel

Node.js v15 landed a feature that should help APM vendors a lot. diagnostics_channel aims to provide a centralized channel of events between modules.

As the release blog post says:

diagnostics_channel is a new experimental module that provides an API to create named channels to report arbitrary message data for diagnostics purposes. With diagnostics_channel, Node.js core and module authors can publish contextual data about what they are doing at a given time. This could be the hostname and query string of a mysql query, for example. Just create a named channel with dc.channel(name) and call channel.publish(data) to send the data to any listeners to that channel.

This feature is similar to EventEmitter; however, publishing to a diagnostics_channel has less overhead than publishing a string-named event to an EventEmitter. More on this in the "Why use this module instead of EventEmitter?" section.

APM vendors usually monkey patch every key module to capture the information that diagnostics_channel simply publishes as events. Monkey patching generally creates many additional costly closures and can be fragile, so it’s much safer to rely on more intentional events.

Monkey patching has two major problems: wrapping everything in closures slows everything down, and the patches are fragile, so they can easily miss something and break functionality. With diagnostics_channel, the publishing API should never break and the overhead should be minimal.
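To make the contrast concrete, here is a minimal sketch of both approaches side by side. The two client classes and the 'example.client.query' channel name are made up for illustration; they are not from any real library.

```javascript
const dc = require('diagnostics_channel')

// Hypothetical client, standing in for any library an APM wants to observe.
class PatchedClient {
  query(sql) {
    return `rows for: ${sql}`
  }
}

// Approach 1: monkey patching. The observer swaps the method for a closure
// that wraps the original, and breaks if the method is renamed or reshaped.
const seenByPatch = []
const originalQuery = PatchedClient.prototype.query
PatchedClient.prototype.query = function wrappedQuery(sql) {
  seenByPatch.push(sql)
  return originalQuery.call(this, sql)
}

// Approach 2: the library publishes intentionally on a named channel
// ('example.client.query' is an assumed name for this sketch).
const channel = dc.channel('example.client.query')

class PublishingClient {
  query(sql) {
    channel.publish({ query: sql })
    return `rows for: ${sql}`
  }
}

// The observer only subscribes; it never touches the library's internals.
const seenByChannel = []
channel.subscribe(({ query }) => seenByChannel.push(query))

new PatchedClient().query('SELECT 1')
new PublishingClient().query('SELECT 1')

console.log(seenByPatch, seenByChannel)
```

Both observers end up with the same information, but only the first one had to reach into the library's prototype to get it.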

Usage

The API is simple. Let’s see some examples:

This one subscribes to the root.caughtError channel.

const dc = require('diagnostics_channel')
const channel = dc.channel('root.caughtError')

channel.subscribe(({ error }) => {
  console.error('One error was propagated through dc', error)
})

But when will it be called?

Well, some part of your application (or even a library it uses) has to publish to this channel:

const dc = require('diagnostics_channel')
const channel = dc.channel('root.caughtError')

async function exampleFunction() {
  try {
    await throwableFunction()
  } catch (e) {
    channel.publish({ error: e })
  }
}
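Putting the two halves together in a single file makes the flow easier to follow. This is only a sketch: riskyOperation is a hypothetical function that always throws, and it is synchronous here so the result can be checked right away. It relies on subscribers being invoked synchronously inside channel.publish().

```javascript
const dc = require('diagnostics_channel')

// Requesting the same name anywhere in the process returns the same channel.
const channel = dc.channel('root.caughtError')

// Subscriber side: collect the message of every error that gets published.
const received = []
channel.subscribe(({ error }) => {
  received.push(error.message)
})

// Hypothetical function that always fails, for illustration only.
function riskyOperation() {
  throw new Error('boom')
}

// Publisher side: report the caught error instead of swallowing it silently.
function exampleFunction() {
  try {
    riskyOperation()
  } catch (e) {
    channel.publish({ error: e })
  }
}

exampleFunction()
console.log(received)
```

The publisher and the subscriber never reference each other; the channel name is the only thing they share.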

Or, for a more realistic example, say we want to measure the time elapsed for each query performed:

// module mysql.js
const dc = require('diagnostics_channel')
const channel = dc.channel('mysql.query')

MySQL.prototype.query = async function query(queryString, values, callback) {
  const start = Date.now() // You can do it with perf_hooks as well
  await this.doQuery(queryString, values, callback);
  const end = Date.now()

  // Broadcast query information whenever a query is done
  channel.publish({
    query: queryString,
    host: this.hostname,
    timeElapsed: end - start,
  })
}

This lets us find bottlenecks in our code:

const dc = require('diagnostics_channel')
const channel = dc.channel('mysql.query')

channel.subscribe(({ timeElapsed, query }) => {
  // Environment variables are strings, so convert the threshold explicitly
  if (timeElapsed > Number(process.env.QUERY_THRESHOLD)) {
    console.warn('Query slow: ', query, timeElapsed)
  }
})

In fact, most database modules can already warn about slow queries if you enable it in their settings. This example is just to show the main usage of this module.
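The comment in the publisher mentions perf_hooks as an alternative to Date.now(). A minimal sketch of that variant could look like the following, assuming a hypothetical timedQuery helper that wraps a synchronous task. Unlike Date.now(), performance.now() is monotonic and returns fractional milliseconds.

```javascript
const dc = require('diagnostics_channel')
const { performance } = require('perf_hooks')

const channel = dc.channel('mysql.query')

// Hypothetical helper: runs a synchronous task and publishes how long it took.
function timedQuery(queryString, run) {
  const start = performance.now() // monotonic, fractional milliseconds
  const result = run()
  channel.publish({
    query: queryString,
    timeElapsed: performance.now() - start,
  })
  return result
}

// Subscriber side: keep every timing that is reported.
const timings = []
channel.subscribe(({ query, timeElapsed }) => {
  timings.push({ query, timeElapsed })
})

// The arrow function stands in for the real database call.
timedQuery('SELECT 1', () => 1)
console.log(timings)
```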

Once key modules support diagnostics_channel, we will have better observability and tracing of our applications without adding a lot of complexity to our Node.js code. Of course, this feature is most attractive to APM vendors.

Why use this module instead of EventEmitter?

EventEmitter has an extra cost on every single publish to look up the handler set by the string event name. That’s not a huge deal for just a single run, but in a high-frequency scenario where the logic might be repeated thousands, or even millions, of times per second, it adds up fast. Additionally, the lookup cost always happens with EventEmitter while with diagnostics_channel it only happens when something is actually listening to that specific channel. The intent is for there to be hundreds or even thousands of these channels being reported to at any given time while there might be only a few of those channels being actively observed at any given time. The majority of the time there would be nothing to publish to so it’s been intentionally designed to do nothing at all in that case. This design makes it much more suitable as a data firehose whereas a typical EventEmitter is really only suited to a more limited set of events.
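The "do nothing when nobody listens" idea can also be used by the publisher, through the channel's hasSubscribers property. Below is a small sketch: the 'example.request' channel name and the buildPayload helper are assumptions for illustration, with buildPayload standing in for any message that is expensive to assemble.

```javascript
const dc = require('diagnostics_channel')

const channel = dc.channel('example.request') // assumed channel name

let payloadsBuilt = 0

// Hypothetical helper standing in for expensive message construction.
function buildPayload(url) {
  payloadsBuilt++
  return { url, at: Date.now() }
}

function handleRequest(url) {
  // Only pay for building the message when something is listening.
  if (channel.hasSubscribers) {
    channel.publish(buildPayload(url))
  }
}

handleRequest('/a') // no subscribers yet: nothing is built or published

const seen = []
channel.subscribe((message) => seen.push(message.url))

handleRequest('/b') // one subscriber: the payload is built and delivered

console.log(payloadsBuilt, seen)
```

The guard is optional, since publish() is already cheap on a channel without subscribers, but it avoids building a message that nobody will read.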

diagnostics_channel was created to publish/receive billions of events per second.

To clarify the above statement, in fewer words:

const EventEmitter = require('events')

class MyEmitter extends EventEmitter {}

const myEmitter = new MyEmitter();

myEmitter.emit('event1');

myEmitter can emit any event name, so every emit has to look up the handlers by that string, which takes more time than the diagnostics_channel approach, where the channel is looked up once and then reused.

What next?

diagnostics_channel is still an experimental module. Since it only provides an API for exposing information out of the box, the community needs to adopt it in their libraries, that is, add support for it in most libraries across the Node.js ecosystem.

For instance, Fastify already supports diagnostics_channel through the fastify-diagnostics-channel plugin.


Acknowledgment

Thanks to @Qard, who spent time working on it and reviewed this quick introduction.