Inter-process communication between Javascript and Python
Often, different programming languages are good at different things. To this end, it is sometimes desirable to write different parts of a program in different languages. In no situation is this more apparent than with an application implementing some kind of (ethical, I hope) AI-based feature.
I'm currently kinda-sorta-maybe thinking of implementing a lightweight web interface backed by an AI model (more details if the idea comes to fruition), and while I like writing web servers in Javascript (it really shines at asynchronous input/output), AI models generally don't like being run in Javascript very much - as I have mentioned before, Tensorflow.js has a number of bugs that mean it isn't practically useful for doing anything serious with AI.
Naturally, the solution then is to run the AI stuff in Python (yeah, Python sucks - believe me I know) since it has the libraries for it, and get Javascript/Node.js to talk to the Python subprocess via inter-process communication, or IPC.
While Node.js has a fanceh message-passing system it calls IPC, this doesn't really work when communicating with processes that don't also run Javascript/Node.js. To this end, the solution is to use the standard input (stdin) and standard output (stdout) of the child process to communicate:
(Above: A diagram of how the IPC setup we're going for works.)
This of course turned out to be more nuanced and complicated than I expected, so I thought I'd document it here - especially since the Internet was very unhelpful on the matter.
Let's start by writing the parent Node.js script. First, we need to spawn that Python subprocess, so let's do that:
import { spawn } from 'child_process';
const python = spawn("path/to/child.py", {
	stdio: [ "pipe", "pipe", "inherit" ]
});
...where we set stdin and stdout to pipe mode - which lets us interact with the streams - and the standard error (stderr) to inherit mode, which allows it to share the parent process' stderr. That way errors in the child process propagate upwards and end up in the same log file that the parent process sends its output to.
If you need to send the Python subprocess some data to start with, you have to wait until it is initialised to send it something:
python.on(`spawn`, () => {
	console.log(`[node:data:out] Sending initial data packet`);
	python.stdin.write(`start\n`);
});
...an easier alternative to message passing for small amounts of data would be to set an environment variable when you call child_process.spawn - i.e. env: { key: "value" } in the options object above. Note though that specifying env replaces the parent process' environment variables entirely, so merge in process.env if the child still needs them.
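As a hedged sketch of that approach (the TEST variable name is just an example - and here I spawn Node itself as the child so the snippet is self-contained; a Python child would read the same variable via os.environ):

```javascript
import { spawnSync } from 'node:child_process';

// Sketch: pass a value to the child via an environment variable instead
// of stdin. Spreading process.env keeps the parent's variables, since a
// plain env: {...} replaces the environment entirely.
const result = spawnSync(process.execPath, ["-e", "console.log(process.env.TEST)"], {
	env: { ...process.env, TEST: "value" },
	encoding: "utf8",
});

console.log(result.stdout.trim()); // prints "value"
```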
Next, we need to read the response from the Python script:
import nexline from 'nexline'; // Put this import at the top of the file
const reader = nexline({
	input: python.stdout,
});

for await(const line of reader) {
	console.log(`[node:data:in] ${line}`);
}
The simplest way to do this would be to listen for the data event on python.stdout, but this does not guarantee that each chunk that arrives is actually a complete line of data, since data between processes is not line-buffered like it is when displaying content in the terminal.
To fix this, I suggest using one of my favourite npm packages: nexline. Believe it or not, handling this issue efficiently with minimal buffering is a lot more difficult than it sounds, so it's just easier to pull in a package to do it for you.
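To illustrate the problem, here's a rough sketch of the manual chunk-to-line buffering you'd otherwise have to write yourself (my own simplification, not nexline's actual implementation):

```javascript
// Accumulate incoming chunks and return only complete lines, keeping any
// trailing partial line in the buffer until the rest of it arrives.
let buffer = "";
function linesFromChunk(chunk) {
	buffer += chunk;
	const lines = buffer.split("\n");
	buffer = lines.pop(); // last element is an incomplete line (or "")
	return lines;
}

console.log(linesFromChunk("boo"));           // → []
console.log(linesFromChunk("p0\nboop1\nbo")); // → [ 'boop0', 'boop1' ]
```

Even this naïve version ignores edge cases like chunk encodings and very long lines, which is exactly why pulling in a tested package is the easier option.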
With a nice little for await..of loop, we can efficiently read the responses from the Python child process.
If you were doing this for real, I would suggest wrapping this in an EventEmitter (Node.js) / EventTarget (WHATWG browser spec, also available in Node.js).
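As a rough sketch of what I mean (the class and event names here are my own invention, and I'm using Node's built-in readline module in place of nexline so the example stands alone):

```javascript
import { EventEmitter } from 'node:events';
import { createInterface } from 'node:readline';

// Wraps any readable stream and re-emits each complete line as a
// "message" event, so callers never touch the stream directly.
class LineChannel extends EventEmitter {
	constructor(stream) {
		super();
		const rl = createInterface({ input: stream });
		rl.on("line", (line) => this.emit("message", line));
		rl.on("close", () => this.emit("close"));
	}
}

// Usage with the python subprocess from earlier would look like:
// const channel = new LineChannel(python.stdout);
// channel.on("message", (line) => console.log(`[node:data:in] ${line}`));
```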
Python child process
That's basically it for the parent process, but what does the Python script look like? It's really quite easy actually:
import sys

sys.stderr.write("[python] hai\n")
sys.stderr.flush()

count = 0
for line in sys.stdin:
    sys.stdout.write(f"boop{count}\n")
    sys.stdout.flush()
    count += 1
Easy! We can simply iterate sys.stdin to read from the parent Node.js process. We can write to sys.stdout to send data back to the parent process, but it's important to call sys.stdout.flush()! Node.js doesn't have an equivalent 'cause it's smart, but in Python the response may not actually be sent until who-knows-when (if at all) unless you call .flush() to force it to. Think of it as batching graphics draw calls to increase efficiency, except that in this case it doesn't work in our favour.
Conclusion
This is just a quick little tutorial on how to implement Javascript/Node.js <--> Python IPC. We deal in plain-text messages here, but I would recommend using JSON - JSON.stringify() / JSON.parse() (Javascript) | json.dumps() / json.loads() (Python) - to serialise / deserialise messages to ensure robustness. JSON by default contains no newline characters and escapes any present into \n, so it should be safe in this instance.
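A quick sketch of that framing on the Javascript side (the function names are mine, just for illustration):

```javascript
// Serialise one message per line: JSON escapes any embedded newlines,
// so each frame is guaranteed to be exactly one line on the wire.
const encode = (message) => JSON.stringify(message) + "\n";
const decode = (line) => JSON.parse(line);

const frame = encode({ action: "boop", count: 0, note: "multi\nline text" });

// The only raw newline in the frame is the trailing delimiter.
console.log(frame.endsWith("\n") && !frame.slice(0, -1).includes("\n")); // prints "true"
```

json.dumps() / json.loads() on the Python side behave the same way with their default settings.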
See also JSON Lines, a related specification.
Until next time!
Code
index.mjs
:
#!/usr/bin/env node
"use strict";
import { spawn } from 'child_process';
import nexline from 'nexline';
///
// Spawn subprocess
///
const python = spawn("/tmp/x/child.py", {
	env: { // Erases the parent process' environment variables
		"TEST": "value"
	},
	stdio: [ "pipe", "pipe", "inherit" ]
});
python.on(`spawn`, () => {
	console.log(`[node:data:out] start`);
	python.stdin.write(`start\n`);
});
///
// Send stuff on loop - example
///
let count = 0;
setInterval(() => {
	python.stdin.write(`interval ${count}\n`);
	console.log(`[node:data:out] interval ${count}`);
	count++;
}, 1000);
///
// Read responses
///
const reader = nexline({
	input: python.stdout,
});

for await(const line of reader) {
	console.log(`[node:data:in] ${line}`);
}
child.py
:
#!/usr/bin/env python3
import sys
sys.stderr.write("[python] hai\n")
sys.stderr.flush()

count = 0
for line in sys.stdin:
    # sys.stderr.write(f"[python:data:in] {line}\n")
    # sys.stderr.flush()
    sys.stdout.write(f"boop{count}\n")
    sys.stdout.flush()
    count += 1