Multiple Interpreters HOWTO¶
In this HOWTO document we’ll look at how to take advantage of
multiple interpreters in a Python program. We will focus on doing
so in Python code, through the stdlib concurrent.interpreters
module.
This document has 2 parts: first a brief introduction and then a tutorial.
The tutorial covers the basics of using multiple interpreters, as well as how to communicate between interpreters. It finishes with some extra information.
See also
Note
This page is focused on using multiple interpreters. For information about changing an extension module to work with multiple interpreters, see Isolating Extension Modules.
Introduction¶
You can find a thorough explanation about what interpreters are and what they’re for in the concurrent.interpreters docs.
In summary, think of an interpreter as the Python runtime’s execution context. You can have multiple interpreters in a single process, switching between them at any time. Each interpreter is almost completely isolated from the others.
Note
Interpreters in the same process can technically never be strictly isolated from one another since there are few restrictions on memory access within the same process. The Python runtime makes a best effort at isolation but extension modules may easily violate that. Therefore, do not use multiple interpreters in security-sensitive situations, where they shouldn’t have access to each other’s data.
That isolation facilitates a concurrency model based on independent logical threads of execution, like CSP or the actor model.
Each actual thread in Python, even if you’re only running in the main thread, has its own current execution context. Multiple threads can use the same interpreter or different ones.
Why Use Multiple Interpreters?¶
These are the main benefits:
isolation supports a human-friendly concurrency model
isolation supports full multi-core parallelism
avoids the data races that make threads so challenging normally
There are some downsides and temporary limitations:
the concurrency model requires extra rigor; it enforces higher discipline about how the isolated components in your program interact
not all PyPI extension modules support multiple interpreters yet
the existing tools for passing data between interpreters safely are still relatively inefficient and limited
actually sharing data safely is tricky (true for free-threading too)
all necessary modules must be imported separately in each interpreter
relatively slow startup time per interpreter
non-trivial extra memory usage per interpreter
Nearly all of these should improve in subsequent Python releases.
Tutorial: Basics¶
First of all, keep in mind that using multiple interpreters is like using multiple processes. They are isolated and independent from each other. The main difference is that multiple interpreters live in the same process, which makes it all more efficient and uses fewer system resources.
Each interpreter has its own __main__
module, its own
sys.modules
, and, in fact, its own set of all runtime state.
Interpreters pretty much never share objects and very little data.
Running Code in an Interpreter¶
Running code in an interpreter is basically equivalent to running the
builtin exec()
using that interpreter:
from concurrent import interpreters
interp = interpreters.create()
interp.exec('print("spam!")')
# prints: spam!
interp.exec("""if True:
...
print('spam!')
...
""")
# prints: spam!
(See Interpreter.exec()
.)
Calls are also supported. See the next section.
When Interpreter.exec()
runs, the code is executed using the
interpreter’s __main__
module, just like a Python process
normally does when invoked from the command-line:
from concurrent import interpreters
print(__name__)
# prints: __main__
interp = interpreters.create()
interp.exec('print(__name__)')
# prints: __main__
In fact, a comparison with python -c
is quite direct:
from concurrent import interpreters
############################
# python -c 'print("spam!")'
############################
interp = interpreters.create()
interp.exec('print("spam!")')
# prints: spam!
It’s also fairly easy to simulate the other forms of the Python CLI:
from concurrent import interpreters
from textwrap import dedent
SCRIPT = """
print('spam!')
"""
with open('script.py', 'w') as outfile:
outfile.write(SCRIPT)
##################
# python script.py
##################
interp = interpreters.create()
with open('script.py') as infile:
script = infile.read()
interp.exec(script)
# prints: spam!
##################
# python -m script
##################
interp = interpreters.create()
interp.exec('import os, sys; sys.path.insert(0, os.getcwd())')
interp.exec(dedent("""
import runpy
runpy.run_module('script')
"""))
# prints: spam!
That’s more or less what the python
executable is doing for each
of those cases.
You can also exec any function that doesn’t take any arguments, nor returns anything. Closures are not allowed but other nested functions are. It works the same as if you had passed the script corresponding to the function’s body:
from concurrent import interpreters
def script():
print('spam!')
def get_script():
def nested():
print('eggs!')
return nested
if __name__ == '__main__':
interp = interpreters.create()
interp.exec(script)
# prints: spam!
script2 = get_script()
interp.exec(script2)
# prints: eggs!
Any referenced globals are resolved relative to the interpreter’s
__main__
module, just like happens for scripts, rather than
the original function’s module:
from concurrent import interpreters
def script():
print(__name__)
if __name__ == '__main__':
interp = interpreters.create()
interp.exec(script)
# prints: __main__
One practical difference is that with a script function you get syntax highlighting in your editor. With script text you probably don’t.
Calling a Function in an Interpreter¶
You can just as easily call a function in another interpreter:
from concurrent import interpreters
def spam():
print('spam!')
if __name__ == '__main__':
interp = interpreters.create()
interp.call(spam)
# prints: spam!
(See Interpreter.call()
.)
In fact, nearly all Python functions and callables are supported, with the notable exclusion of closures. We’ll focus on plain functions for the moment.
Relative to Interpreter.call()
, there are five kinds of functions:
builtin functions
extension module functions
“stateless” user functions
stateful user functions, defined in
__main__
stateful user functions, not defined in
__main__
A “stateless” function is one that doesn’t use any globals. It also can’t be a closure. It can have parameters but not argument defaults. It can have return values. We’ll cover arguments and returning soon.
Some builtin functions, like exec()
and eval()
, use the
current globals as a default (or implicit
) argument
value. When such functions are called directly with
Interpreter.call()
, they use the __main__
module’s
__dict__
as the default/implicit “globals”:
from concurrent import interpreters
interp = interpreters.create()
interp.call(exec, 'print(__name__)')
# prints: __main__
interp.call(eval, 'print(__name__)')
# prints: __main__
Note that, for some of those builtin functions, like globals()
and even eval()
(for some expressions), the return value may
be “unshareable” and the call will
fail.
In each of the above cases, the function is more-or-less copied, when sent to the other interpreter. For most of the cases, the function’s module is imported in the target interpreter and the function is pulled from there. The exceptions are cases (3) and (4).
In case (3), the function is also copied, but efficiently and only
temporarily (long enough to be called) and is not actually bound to
any module. This temporary function object’s __name__
will
be set to match the code object, but the __module__
,
__qualname__
, and other attributes are not guaranteed to
be set. This shouldn’t be a problem in practice, though, since
introspecting the currently running function is fairly rare.
We’ll cover case (4) in the next section.
In all the cases, the function will get any global variables from the module (in the target interpreter), like normal:
from concurrent import interpreters
with open('mymod.py', 'w') as outfile:
outfile.write('def spam(): print(__name__)')
from mymod import spam
if __name__ == '__main__':
interp = interpreters.create()
interp.call(spam)
# prints: mymod
##########
# mymod.py
##########
def spam():
print(__name__)
Note, however, that in case (3) functions rarely look up any globals.
Functions Defined in __main__¶
Functions defined in __main__
are treated specially by
Interpreter.call()
. This is because the script or module that
ran in the main interpreter will have not run in other interpreters,
so any functions defined in the script would only be accessible by
running it in the target interpreter first. That is essentially
how Interpreter.call()
handles it.
If the function isn’t found in the target interpreter’s __main__
module then the script that ran in the main interpreter gets run in the
target interpreter, though under a different name than “__main__”.
The function is then looked up on that resulting fake __main__ module.
For example:
from concurrent import interpreters
def spam():
print(__name__)
if __name__ == '__main__':
interp = interpreters.create()
interp.call(spam)
# prints: <fake __main__>
The dummy module is never added to sys.modules
, though it may
be cached away internally.
This approach with a fake __main__
module means we can avoid
polluting the __main__
module of the target interpreter. It
also means global state used in such a function won’t be reflected
in the interpreter’s __main__
module:
from concurrent import interpreters
from textwrap import dedent
total = 0
def inc():
global total
total += 1
def show_total():
print(total)
if __name__ == '__main__':
interp = interpreters.create()
interp.call(show_total)
# prints: 0
interp.call(inc)
interp.call(show_total)
# prints: 1
interp.exec(dedent("""
try:
print(total)
except NameError:
pass
else:
raise AssertionError('expected NameError')
"""))
interp.exec('total = -1')
interp.exec('print(total)')
# prints: -1
interp.call(show_total)
# prints: 1
interp.call(inc)
interp.call(show_total)
# prints: 2
interp.exec('print(total)')
# prints: -1
interp.exec('import sys; sys.stdout.flush()')
print(total)
# prints: 0
Note that the recommended if __name__ == '__main__`:
idiom is
especially important for such functions, since the script will be
executed with __name__
set to something other than
__main__
. Thus, for any code you don’t want run repeatedly,
put it in the if block. You will probably only want functions or
classes outside the if block.
Another thing to keep in mind is that, when you run the REPL or run
python -c ...
, the code that runs is essentially unrecoverable. The
contents of the __main__
module cannot be reproduced by executing
a script (or module) file. Consequently, calling a function defined
in __main__
in these cases will probably fail.
Here’s one last tip before we move on. You can avoid the extra
complexity involved with functions (and classes) defined in
__main__
by simply not defining them in __main__
.
Instead, put them in another module and import them in your script.
Calling Methods and Other Objects in an Interpreter¶
Pretty much any callable object may be run in an interpreter, following the same rules as functions:
from concurrent import interpreters
with open('mymod.py', 'w') as outfile:
outfile.write("""if True:
class Spam:
@classmethod
def modname(cls):
print(__name__)
def __call__(self):
print(__name__)
def spam(self):
print(__name__)
""")
from mymod import Spam
class Eggs:
@classmethod
def modname(cls):
print(__name__)
def __call__(self):
print(__name__)
def eggs(self):
print(__name__)
if __name__ == '__main__':
interp = interpreters.create()
spam = Spam()
interp.call(Spam.modname)
# prints: mymod
interp.call(spam)
# prints: mymod
interp.call(spam.spam)
# prints: mymod
eggs = Eggs()
res = interp.call(Eggs.modname)
# prints: <fake __main__>
assert res is None
res = interp.call(eggs)
# prints: <fake __main__>
assert res is None
res = interp.call(eggs.eggs)
# prints: <fake __main__>
assert res is None
##########
# mymod.py
##########
class Spam:
@classmethod
def modname(cls):
print(__name__)
def __call__(self):
print(__name__)
def spam(self):
print(__name__)
Preparing and Reusing an Interpreter¶
When you use the Python REPL, it doesn’t reset after each command.
That would make the REPL much less useful. Likewise, when you use
the builtin exec()
, it doesn’t reset the namespace it uses
to run the code.
In the same way, running code in an interpreter does not reset that
interpreter. The next time you run code in that interpreter, the
__main__
module will be in exactly the state in which you
left it:
from concurrent import interpreters
interp = interpreters.create()
interp.exec("""if True:
answer = 42
""")
interp.exec("""if True:
assert answer == 42
print('okay!')
""")
# prints: okay!
Similarly:
from concurrent import interpreters
def set(value):
assert __name__ == '<fake __main__>', __name__
global answer
answer = value
def check(expected):
assert answer == expected
print('okay!')
if __name__ == '__main__':
interp = interpreters.create()
interp.call(set, 100)
interp.call(check, 100)
# prints: okay!
You can take advantage of this to prepare an interpreter ahead of time or use it incrementally:
from concurrent import interpreters
from textwrap import dedent
interp = interpreters.create()
# Prepare the interpreter.
interp.exec(dedent("""
# We will need this later.
import math
# Initialize the value.
value = 1
def double(val):
return val + val
"""))
# Do the work.
for _ in range(9):
interp.exec(dedent("""
assert math.factorial(value + 1) >= double(value)
value = double(value)
"""))
# Show the result.
interp.exec('print(value)')
# prints: 512
In case you’re curious, in a little while we’ll look at how to pass data in and out of an interpreter (instead of just printing things).
Sometimes you might also have values that you want to use in a later script in another interpreter. We’ll look more closely at that case in a little while: Initializing Globals for a Script.
Running in a Thread¶
A key point is that, in a new Python process, the runtime doesn’t create a new thread just to run the target script. It uses the main thread the process started with.
In the same way, running code in another interpreter doesn’t create (or need) a new thread. It uses the current OS thread, switching to the interpreter long enough to run the code.
To actually run in a new thread, you can do it explicitly:
from concurrent import interpreters
import threading
interp = interpreters.create()
def run():
interp.exec("print('spam!')")
t = threading.Thread(target=run)
t.start()
t.join()
# prints: spam!
def run():
def script():
print('spam!')
interp.call(script)
t = threading.Thread(target=run)
t.start()
t.join()
# prints: spam!
There’s also a helper method for that:
from concurrent import interpreters
def script():
print('spam!')
if __name__ == '__main__':
interp = interpreters.create()
t = interp.call_in_thread(script)
t.join()
# prints: spam!
Handling Uncaught Exceptions¶
Consider what happens when you run a script at the commandline and there’s an unhandled exception. In that case, Python will print the traceback and the process will exit with a failure code.
The behavior is very similar when code is run in an interpreter.
The traceback gets printed and, rather than a failure code,
an ExecutionFailed
exception is raised:
from concurrent import interpreters
interp = interpreters.create()
try:
interp.exec('1/0')
except interpreters.ExecutionFailed:
print('oops!')
# prints: oops!
The exception also has some information, which you can use to handle it with more specificity:
from concurrent import interpreters
interp = interpreters.create()
try:
interp.exec('1/0')
except interpreters.ExecutionFailed as exc:
if exc.excinfo.type.__name__ == 'ZeroDivisionError':
print('error ignored')
else:
raise # re-raise
# prints: error ignored
You can handle the exception in the interpreter as well, like you might do with threads or a script:
from concurrent import interpreters
from textwrap import dedent
interp = interpreters.create()
interp.exec(dedent("""
try:
1/0
except ZeroDivisionError:
print('error ignored')
"""))
# prints: error ignored
At the moment there isn’t an easy way to “re-raise” an unhandled exception from another interpreter. Here’s one approach that works for exceptions that can be pickled:
from concurrent import interpreters
from textwrap import dedent
interp = interpreters.create()
try:
try:
interp.exec(dedent("""
try:
1/0
except Exception as exc:
import pickle
data = pickle.dumps(exc)
class PickledException(Exception):
pass
raise PickledException(data)
"""))
except interpreters.ExecutionFailed as exc:
if exc.excinfo.type.__name__ == 'PickledException':
import pickle
raise pickle.loads(eval(exc.excinfo.msg))
else:
raise # re-raise
except ZeroDivisionError:
# Handle it!
print('error ignored')
# prints: error ignored
Managing Interpreter Lifetime¶
Every interpreter uses resources, particularly memory, that should be cleaned up as soon as you’re done with the interpreter. While you can wait until the interpreter is cleaned up when the process exits, you can explicitly clean it up sooner:
from concurrent import interpreters
interp = interpreters.create()
try:
interp.exec('print("spam!")')
finally:
interp.close()
# prints: spam!
Tutorial: Communicating Between Interpreters¶
Multiple interpreters are useful in the convenience and efficiency they provide for isolated computing. However, their main strength is in running concurrent code. Much like with any concurrency tool, there has to be a simple and efficient way to communicate between interpreters.
In this half of the tutorial, we explore various ways you can do so.
Call Args and Return Values¶
As already noted, Interpreter.call()
supports most callables,
as well as arguments and return values.
Arguments provide a way to send information to an interpreter:
from concurrent import interpreters
def show(arg):
print(arg)
def ident_full(a, /, b, c=42, *args, d, e='eggs', **kwargs):
print([a, b, c, d, e], args, kwargs)
def handle_request(req):
# do the work
print(req)
if __name__ == '__main__':
interp = interpreters.create()
interp.call(show, 'spam!')
# prints: spam!
interp.call(ident_full, 1, 2, 3, 4, 5, d=6, e=7, f=8, g=9)
# prints: [1, 2, 3, 6, 7] (4, 5) {'f': 8, 'g': 9}
req = {'x': 1, 'y': -1.3, 'misc': '???'}
interp.call(handle_request, req)
# prints: {'x': 1, 'y': -1.3, 'misc': '???'}
Return values are a way an interpreter can send information back:
from concurrent import interpreters
data = {}
def put(key, value):
data[key] = value
def get(key, default=None):
return data.get(key, default)
if __name__ == '__main__':
interp = interpreters.create()
res1 = interp.call(get, 'spam')
res2 = interp.call(get, 'spam', -1)
interp.call(put, 'spam', True)
res3 = interp.call(get, 'spam')
interp.call(put, 'spam', 42)
res4 = interp.call(get, 'spam')
print(res1, res2, res3, res4)
# prints: None -1 True 42
Don’t forget that the underlying data of few objects is actually shared between interpreters. That means that, nearly always, arguments are copied on the way in and return values on the way out. Furthermore, this will happen with each call, so it’s a new copy every time and no state persists between calls.
For example:
from concurrent import interpreters
data = {
'a': 1,
'b': 2,
'c': 3,
}
if __name__ == '__main__':
interp = interpreters.create()
interp.call(data.clear)
assert data == dict(a=1, b=2, c=3)
def update_and_copy(data, **updates):
data.update(updates)
return dict(data)
res = interp.call(update_and_copy, data)
assert res == dict(a=1, b=2, c=3)
assert res is not data
assert data == dict(a=1, b=2, c=3)
res = interp.call(update_and_copy, data, d=4, e=5)
assert res == dict(a=1, b=2, c=3, d=4, e=5)
assert data == dict(a=1, b=2, c=3)
res = interp.call(update_and_copy, data)
assert res == dict(a=1, b=2, c=3)
assert res is not data
assert data == dict(a=1, b=2, c=3)
print('okay!')
# prints: okay!
Supported and Unsupported Objects¶
There are other ways of communicating between interpreters, like
Interpreter.prepare_main()
and Queue
, which we’ll
cover shortly. Before that, we should talk about which objects are
supported by Interpreter.call()
and the others.
Most objects are supported. This includes anything that can be
pickled
, though for some pickleable objects we copy the
object in a more efficient way. Aside from pickleable objects, there
is a small set of other objects that are supported, such as
non-closure inner functions.
If you try to pass an unsupported object as an argument to
Interpreter.call()
then you will get a NotShareableError
with __cause__
set appropriately. Interpreter.call()
will
also raise that way if the return value is not supported. The same
goes for prepare_main()
and Queue.*
.
Relatedly, there’s an interesting side-effect to how we use a fake
module for objects with a class defined in __main__
. If you
pass the class to Interpreter.call()
, which will create an
instance in the other interpreter, then that instance will fail to
unpickle in the original interpreter, due to the fake module name:
from concurrent import interpreters
class Spam:
def __init__(self, x):
self.x = x
if __name__ == '__main__':
interp = interpreters.create()
try:
spam = interp.call(Spam, 10)
except interpreters.NotShareableError:
print('okay!')
else:
raise AssertionError('unexpected success')
# prints: okay!
Using Queues¶
concurrent.interpreters.Queue
is an implementation of the
queue.Queue
interface that supports safely and efficiently
passing data between interpreters:
from concurrent import interpreters
import time
def push(queue, value, *extra, delay=0.1):
queue.put(value)
for val in extra:
if delay > 0:
time.sleep(delay)
queue.put(val)
def pop(queue, count=1, timeout=None):
return tuple(queue.get(timeout=timeout)
for _ in range(count))
if __name__ == '__main__':
interp1 = interpreters.create()
interp2 = interpreters.create()
queue = interpreters.create_queue()
t = interp1.call_in_thread(push, queue,
'spam!', 42, 'eggs')
res1 = interp2.call(pop, queue)
res2 = interp2.call(pop, queue, 2)
try:
interp2.call(pop, queue, timeout=0)
except interpreters.ExecutionFailed:
pass
t.join()
print(res1)
# prints: ('spam!',)
print(res2)
# prints: (42, 'eggs')
Initializing Globals for a Script¶
When you call a function in Python, sometimes that function depends on
arguments, globals, and non-locals, and sometimes it doesn’t. In the
same way, sometimes a script you want to run in another interpreter
depends on some global variables. Setting them ahead of time on the
interpreter’s __main__
module, where the script will run,
is the simplest kind of communication between interpreters.
There’s a method that supports this: Interpreter.prepare_main()
.
It binds values to names in the interpreter’s __main__
module,
which makes them available to any scripts that run in the interpreter
after that:
from concurrent import interpreters
from textwrap import dedent
interp = interpreters.create()
def prep_interp(interp, name, value):
interp.prepare_main(**{name: value})
interp.exec(dedent(f"""
print('{name} =', {name})
"""))
prep_interp(interp, 'spam', 42)
# prints: spam = 42
with open('spam.txt', 'w') as outfile:
outfile.write(dedent("""\
line 1
line 2
EOF
"""))
with open('spam.txt') as infile:
interp.prepare_main(fd=infile.fileno())
interp.exec(dedent(f"""
for line in open(fd, closefd=False):
print(line, end='')
"""))
# prints: line 1
# prints: line 2
# prints: EOF
This is particularly useful when you want to use a queue in a script:
from concurrent import interpreters
from textwrap import dedent
interp = interpreters.create()
queue = interpreters.create_queue()
interp.prepare_main(queue=queue)
queue.put('spam!')
interp.exec(dedent("""
msg = queue.get()
print(msg)
"""))
# prints: spam!
For basic, intrinsic data it can also make sense to use an f-string or format string for the script and interpolate the required values:
from concurrent import interpreters
from textwrap import dedent
interp = interpreters.create()
def run_interp(interp, name, value):
interp.exec(dedent(f"""
{name} = {value!r}
print('{name} =', {name})
"""))
run_interp(interp, 'spam', 42)
# prints: spam = 42
with open('spam.txt', 'w') as outfile:
outfile.write(dedent("""\
line 1
line 2
EOF
"""))
with open('spam.txt') as infile:
interp.exec(dedent(f"""
for line in open({infile.fileno()}, closefd=False):
print(line, end='')
"""))
# prints: line 1
# prints: line 2
# prints: EOF
Pipes, Sockets, etc.¶
For the sake of contrast, let’s take a look at a different approach
to duplex communication between interpreters, which you can implement
using OS-provided utilities. You could do something similar with
subprocesses. The examples will use os.pipe()
, but the approach
applies equally well to regular files and sockets.
Keep in mind that pipes, sockets, and files work with bytes
,
not other objects, so this may involve at least some serialization.
Aside from that inefficiency, sending data through the OS will
also slow things down.
First, let’s use a pipe to pass a message from one interpreter to another:
from concurrent import interpreters
import os
READY = b'\0'
def send(fd_tokens, fd_data, msg):
# Wait until ready.
token = os.read(fd_tokens, 1)
assert token == READY, token
# Ready!
os.write(fd_data, msg)
if __name__ == '__main__':
interp = interpreters.create()
r_tokens, s_tokens = os.pipe()
r_data, s_data = os.pipe()
try:
t = interp.call_in_thread(
send, r_tokens, s_data, b'spam!')
os.write(s_tokens, READY)
msg = os.read(r_data, 20)
t.join()
finally:
os.close(r_tokens)
os.close(s_tokens)
os.close(r_data)
os.close(s_data)
print(msg)
# prints: b'spam!'
One interesting part of that is how the subthread blocked until we sent the “ready” token. In addition to delivering the message, the read end of the pipes acted like locks.
We can actually make use of that to synchronize execution between the
interpreters (and use Queue
the same way):
from concurrent import interpreters
import os
STOP = b'\0'
READY = b'\1'
INIT = b'\2'
_print = print
def print(*args, **kwargs):
_print(*args, **kwargs, flush=True)
def task(tokens):
r, s = tokens
# Wait for the controller.
token = os.read(r, 1)
if token == INIT:
# Do init stuff.
print('<', end='')
# Unlock the controller.
os.write(s, READY)
# Wait for the controller.
token = os.read(r, 1)
while token != STOP:
assert token == READY, token
# Do some work.
print('*', end='')
# Unlock the controller.
os.write(s, token)
# Wait for the controller.
token = os.read(r, 1)
# Clean up, and unblock the controller.
print('>', end='')
os.write(s, token)
steps = ['a', 'b', 'c', ..., 'x', 'y', 'z']
if __name__ == '__main__':
interp = interpreters.create()
r1, s1 = os.pipe()
r2, s2 = os.pipe()
try:
t = interp.call_in_thread(task, (r1, s2))
# Initialize the worker.
os.write(s1, INIT)
os.read(r2, 1)
steps = iter(steps)
# Do the first step.
for step in steps:
# Do the step.
print('...' if step is ... else step, end='')
break
for step in steps:
# Let the worker take a step.
os.write(s1, READY)
os.read(r2, 1)
# Do the step.
print('...' if step is ... else step, end='')
# Stop the worker.
os.write(s1, STOP)
# Wait for it to finish.
os.read(r2, 1)
t.join()
print()
finally:
os.close(r1)
os.close(s1)
os.close(r2)
os.close(s2)
# prints: <a*b*c*...*x*y*z>
You can also close the pipe ends and join the thread to synchronize.
Using os.pipe()
(or similar) to communicate between interpreters
is a little awkward, as well as inefficient.
Tutorial: Miscellaneous¶
concurrent.futures.InterpreterPoolExecutor¶
The concurrent.futures
module is a simple, popular concurrency
framework that wraps both threading and multiprocessing. You can also
use it with multiple interpreters, via
InterpreterPoolExecutor
:
from concurrent.futures import InterpreterPoolExecutor
def script():
return 'spam!'
with InterpreterPoolExecutor(max_workers=5) as executor:
fut = executor.submit(script)
res = fut.result()
print(res)
# prints: spam!
Capturing an interpreter’s stdout¶
While interpreters share the same default file descriptors for stdout,
stderr, and stdin, each interpreter still has its own sys
module,
with its own stdout
and so forth. That means it isn’t
obvious how to capture stdout for multiple interpreters.
The solution is a bit anticlimactic; you have to capture it for each interpreter manually. Here’s a basic example:
from concurrent import interpreters
import contextlib
import io
def task():
print('before', flush=True)
stdout = io.StringIO()
with contextlib.redirect_stdout(stdout):
# Do stuff in worker!
print('captured!', flush=True)
print('after', flush=True)
return stdout.getvalue()
if __name__ == '__main__':
interp = interpreters.create()
output = interp.call(task)
print(output, end='')
# prints: before
# prints: after
# prints: captured!
Here’s a more elaborate example:
from concurrent import interpreters
import contextlib
import errno
import os
import sys
import threading
def run_and_capture(fd, task, args, kwargs):
stdout = open(fd, 'w', closefd=False)
with contextlib.redirect_stdout(stdout):
return task(*args, **kwargs)
@contextlib.contextmanager
def running_captured(interp, task, *args, **kwargs):
def background(fd, stdout=sys.stdout):
# Pass through captured output to local stdout.
# Normally we would not ignore the encoding.
infile = open(fd, 'r', closefd=False)
while True:
try:
line = infile.readline()
except OSError as exc:
if exc.errno == errno.EBADF:
break
raise # re-raise
stdout.write(line)
bg = None
r, s = os.pipe()
try:
bg = threading.Thread(target=background, args=(r,))
bg.start()
t = interp.call_in_thread(run_and_capture, s, task, args, kwargs)
try:
yield
finally:
t.join()
finally:
os.close(r)
os.close(s)
if bg is not None:
bg.join()
def task():
# Do stuff in worker!
print('captured!', flush=True)
if __name__ == '__main__':
interp = interpreters.create()
print('before')
with running_captured(interp, task):
# Do stuff in main!
...
print('after')
# prints: before
# prints: captured!
# prints: after
Using a logger
can also help.