Understanding the event driven nature of Node.js
Fundamental Operators #
To understand at the fundamental level how the event driven mechanism in node.js works, lets start with the fundamental design of the computer and understand how the computer works. Still looking at the very high level and ignoring the magic that the OS provides, there is something called the
One basic operator fundamentally provided at the hardware layer is the conditional jump operator. This provides the basics of all the other functionality that traditional programming brings. What this provides is the ability to jump to a different line in code if a certain condition is true. We use it almost as is for the
if-else conditional like shown in Figure 2.
We also use it for the looping dispatch where we have a jump back to somewhere earlier in code. This is understood cleanest in the
do..while loop in some programming languages, but functionally the
for and the
while loops are built over the same construct. This type of construct is shown in Figure 3.
The other constructs like functions are just two jump operators one goes into the special piece of code and the other comes back to the main executing code. But it still fundamentally is the same operator. The functional construct is shown in Figure 4.
Asynchronous programming #
Asynchronous programming is fundamentally different from regular programming. In asynchronous programming, we have two things happening in parallel. These two things not necessarily mean two CPUs/threads running in parallel. There could be multiple things that happen together. The hard disk has a spooler that actually writes from the RAM to the disk. The CPU is available to do something else while the write is happening. Similar is the case with
A piece of asynchronous code logically looks like what is shown in Figure 5. This is exactly how we do multi-threaded programming in most use cases. When the control reaches an asynchronous method, the second thread or hardware picks up the task and continue running in parallel. It could access the same resources and run at same or different speeds which the OS does not give the control over. It is very difficult to understand it across compiler optimizations and the assembly format of the hardware anyways. In traditional multi-threaded programming, at some point if we want to do some work in parallel we would fork a thread and then have the two threads running in parallel until we are done with the parallel operation. The
join operation is what follows next where the faster thread waits for the slower thread to catch up and the resources are merged and the execution continues.
To achieve the behavior that logically looks like the multi-threaded behavior in Figure 5 but can reuse the same flow of execution, the node runtime maintains a queue of tasks that the other pieces of hardware complete. So it never waits for the disk pooler or the network card. As soon as it reaches a statement that needs to send some work to the disk pooler, it takes the scope of the callback and ensures that it remains in the RAM by marking that used. Then the runtime continues. The other task is triggered and node does not wait for it. When the node reaches an end of the current task or things like an
await statement, it looks at the queue of tasks that are completed by the other pieces of hardware that are waiting for further execution. It picks up the first task and attaches its scope to start execution. Therefore the main thread only halts if there is no work to do and in those cases if there is no timers or other hardware scheduled to do work, the node program quits. Figure 6 shows how this thing works. As programmers the act of picking up other task by the node process is something that may not impact us and therefore we remain oblivious to this event driven nature underneath. We can continue to believe that the current flow waits for the other hardware when we write
await as logically that is correct. But instead of another thread modifying another variable, it is the same thread that performs this logic.
async await enable programming like Figure 5 without the overhead of making the thread wait. This concept has been so powerful that almost all languages have upended their standard libraries to add support for these features.
Node's event driven architecture is neither new nor unique. But node's standard library was the first to embrace the event driven reality of working with multiple pieces of hardware at a large scale and forced all programmers to care about it. It has reaped the rewards to be hottest technology on the servers 10+ years and still running.
This post is accompanying my speech at Byteconf JS 2019.