# Process-Bound Python Environments
*Since version 2.2.0*
## Philosophy
In Erlang, processes are the fundamental unit of isolation. Each process has its own heap, mailbox, and lifecycle. When a process crashes, it takes its state with it and can be restarted clean by a supervisor.
erlang_python extends this philosophy to Python: **each Erlang process gets its own isolated Python environment**. Variables, imports, and objects defined in one process are invisible to others, even when using the same Python context.
This design enables:
- **Clean restarts**: Resetting Python state = terminating the Erlang process
- **Fault isolation**: A corrupted Python state crashes only its owning process
- **Supervision**: Standard OTP supervisors can manage Python-backed actors
- **Actor model**: Build stateful Python services that behave like gen_servers
## How It Works
When you call `py:exec/eval/call`, the library:
1. Looks up a process-local environment keyed by `{ContextPid, InterpreterId}`
2. Creates one if it doesn't exist (a Python `dict` inside the interpreter)
3. Uses that dict as the namespace for execution
4. Cleans up automatically when the Erlang process exits
```erlang
%% Process A
spawn(fun() ->
    Ctx = py:context(1),
    ok = py:exec(Ctx, <<"state = 'hello'">>),
    {ok, <<"hello">>} = py:eval(Ctx, <<"state">>)
end).

%% Process B - same context, isolated state
spawn(fun() ->
    Ctx = py:context(1),
    %% state is undefined here - different process
    {error, _} = py:eval(Ctx, <<"state">>)
end).
```
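The four steps above can be modelled in plain Python. The sketch below is a conceptual illustration, not the library's implementation: `EnvRegistry`, its method names, and the string process identifiers are hypothetical, and it assumes a single dict serves as the whole namespace for each environment.

```python
# Conceptual model only: EnvRegistry and its methods are hypothetical names,
# not part of erlang_python.
from typing import Any, Dict, Tuple

EnvKey = Tuple[str, int]  # (context pid, interpreter id)


class EnvRegistry:
    """Gives every owning process its own namespace dict per context."""

    def __init__(self) -> None:
        # owner pid -> {(context pid, interpreter id) -> namespace dict}
        self._envs: Dict[str, Dict[EnvKey, Dict[str, Any]]] = {}

    def _env(self, owner: str, key: EnvKey) -> Dict[str, Any]:
        # Steps 1-2: look the environment up, creating it on first use.
        return self._envs.setdefault(owner, {}).setdefault(key, {})

    def run(self, owner: str, key: EnvKey, code: str) -> None:
        # Step 3: the dict is the namespace the code executes in.
        exec(code, self._env(owner, key))

    def evaluate(self, owner: str, key: EnvKey, expr: str) -> Any:
        return eval(expr, self._env(owner, key))

    def process_exited(self, owner: str) -> None:
        # Step 4: drop every namespace the exited process owned.
        self._envs.pop(owner, None)


registry = EnvRegistry()
ctx = ("<0.90.0>", 0)  # made-up context pid and interpreter id

registry.run("proc_a", ctx, "state = 'hello'")
print(registry.evaluate("proc_a", ctx, "state"))  # hello

try:
    registry.evaluate("proc_b", ctx, "state")  # same context, other process
except NameError:
    print("proc_b cannot see proc_a's state")

registry.process_exited("proc_a")  # proc_a's namespace is discarded
```

In this model a name defined by one owner never appears in another owner's dict, which mirrors the behaviour the Erlang example demonstrates.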
## OWN_GIL Mode
OWN_GIL contexts (Python 3.12+) provide true parallel execution with dedicated pthreads. Process-bound environments work with OWN_GIL, allowing multiple Erlang processes to share a single OWN_GIL context while maintaining isolated Python namespaces.
### Explicit Environment Creation
For OWN_GIL contexts, you can explicitly create and manage environments:
```erlang
%% Create an OWN_GIL context
{ok, Ctx} = py_context:start_link(1, owngil),

%% Create a process-local environment
{ok, Env} = py_context:create_local_env(Ctx),

%% Get the NIF reference for low-level operations
CtxRef = py_context:get_nif_ref(Ctx),

%% Execute code in the isolated environment.
%% The Python source starts at column 0 so its indentation stays valid.
ok = py_nif:context_exec(CtxRef, <<"
class MyService:
    def __init__(self):
        self.counter = 0

    def increment(self):
        self.counter += 1
        return self.counter

service = MyService()
">>, Env),

%% Call functions in the environment
{ok, 1} = py_nif:context_eval(CtxRef, <<"service.increment()">>, #{}, Env),
{ok, 2} = py_nif:context_eval(CtxRef, <<"service.increment()">>, #{}, Env).
```
### Sharing Context, Isolating State
Multiple Erlang processes can share an OWN_GIL context while maintaining isolated namespaces:
```erlang
%% Shared OWN_GIL context
{ok, Ctx} = py_context:start_link(1, owngil),
CtxRef = py_context:get_nif_ref(Ctx),

%% Process A - its own namespace
spawn(fun() ->
    {ok, EnvA} = py_context:create_local_env(Ctx),
    ok = py_nif:context_exec(CtxRef, <<"x = 'from A'">>, EnvA),
    {ok, <<"from A">>} = py_nif:context_eval(CtxRef, <<"x">>, #{}, EnvA)
end),

%% Process B - separate namespace, same context
spawn(fun() ->
    {ok, EnvB} = py_context:create_local_env(Ctx),
    ok = py_nif:context_exec(CtxRef, <<"x = 'from B'">>, EnvB),
    {ok, <<"from B">>} = py_nif:context_eval(CtxRef, <<"x">>, #{}, EnvB)
end).

%% Both processes use the same OWN_GIL context, but each sees only its own state
```
### When to Use Explicit vs Implicit Environments
| Approach | API | Use Case |
|----------|-----|----------|
| **Implicit** | `py:exec/eval/call` | Simple cases, automatic management |
| **Explicit** | `create_local_env` + `py_nif:context_*` | OWN_GIL, fine-grained control, multiple envs per process |
**Use implicit (py:exec)** when:
- You use worker or subinterp modes
- One environment per process is sufficient
- You want automatic lifecycle management
**Use explicit (create_local_env)** when:
- You use OWN_GIL mode for parallel execution
- You need multiple environments in a single process
- You want to pass environments between processes
- You need direct NIF-level control
## Event Loop Environments
The event loop API also supports per-process namespaces. Each Erlang process gets an isolated namespace within the event loop, allowing you to define functions and state that persist across async task calls.
### Defining Functions for Async Tasks
```erlang
%% Get the event loop reference
{ok, Loop} = py_event_loop:get_loop(),
LoopRef = py_event_loop:get_nif_ref(Loop),

%% Define functions in this process's namespace
ok = py_nif:event_loop_exec(LoopRef, <<"
import asyncio

async def my_async_function(x):
    await asyncio.sleep(0.1)
    return x * 2

counter = 0

async def increment_and_get():
    global counter
    counter += 1
    return counter
">>),

%% Call the function via create_task - uses __main__ module
{ok, Ref} = py_event_loop:create_task(Loop, '__main__', my_async_function, [21]),
{ok, 42} = py_event_loop:await(Ref),

%% State persists across calls
{ok, Ref1} = py_event_loop:create_task(Loop, '__main__', increment_and_get, []),
{ok, 1} = py_event_loop:await(Ref1),
{ok, Ref2} = py_event_loop:create_task(Loop, '__main__', increment_and_get, []),
{ok, 2} = py_event_loop:await(Ref2).
```
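The reason `global counter` persists between task calls can be seen in plain Python. This is a minimal sketch, assuming the namespace is a single dict used as the function's globals; the variable `env` is hypothetical and stands in for the per-process namespace.

```python
import asyncio

# Hypothetical stand-in for one process's namespace.
env = {}

exec('''
counter = 0

async def increment_and_get():
    global counter
    counter += 1
    return counter
''', env)

# A function defined by exec keeps `env` as its globals, so `global counter`
# reads and writes env["counter"] on every call.
print(asyncio.run(env["increment_and_get"]()))  # 1
print(asyncio.run(env["increment_and_get"]()))  # 2
print(env["counter"])                           # 2
```

A second dict passed to `exec` would get its own `counter`, which is the isolation the per-process namespaces rely on.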
### Evaluating Expressions
```erlang
%% Evaluate expressions in the process's namespace
{ok, 42} = py_nif:event_loop_eval(LoopRef, <<"21 * 2">>),
%% Access variables defined via exec
ok = py_nif:event_loop_exec(LoopRef, <<"result = 'computed'">>),
{ok, <<"computed">>} = py_nif:event_loop_eval(LoopRef, <<"result">>).
```
### Process Isolation
Different Erlang processes have isolated event loop namespaces:
```erlang
{ok, Loop} = py_event_loop:get_loop(),
LoopRef = py_event_loop:get_nif_ref(Loop),
%% Process A defines x
spawn(fun() ->
    ok = py_nif:event_loop_exec(LoopRef, <<"x = 'A'">>),
    {ok, <<"A">>} = py_nif:event_loop_eval(LoopRef, <<"x">>)
end),

%% Process B has its own x
spawn(fun() ->
    ok = py_nif:event_loop_exec(LoopRef, <<"x = 'B'">>),
    {ok, <<"B">>} = py_nif:event_loop_eval(LoopRef, <<"x">>)
end).
```
### Cleanup
Event loop namespaces are automatically cleaned up when the Erlang process exits. The event loop monitors each process that creates a namespace and removes it on process termination.
### Automatic Env Reuse with py:exec
Functions defined via `py:exec(Ctx, Code)` can be called directly using the async task API (`py_event_loop:run/3,4`, `create_task/3,4`, `spawn_task/3,4`). The process-local environment is automatically detected and used for function lookup.
```erlang
%% Create a context and define an async function
Ctx = py:context(1),
ok = py:exec(Ctx, <<"
import asyncio

async def process_data(items):
    results = []
    for item in items:
        await asyncio.sleep(0.001)
        results.append(item * 2)
    return results
">>),

%% Call it directly - env is reused automatically
{ok, [2,4,6]} = py_event_loop:run('__main__', process_data, [[1,2,3]]).

%% Also works with create_task and spawn_task
Ref = py_event_loop:create_task('__main__', process_data, [[4,5,6]]),
{ok, [8,10,12]} = py_event_loop:await(Ref).

%% Fire-and-forget
ok = py_event_loop:spawn_task('__main__', process_data, [[7,8,9]]).
```
This eliminates the need to manually pass environment references when calling functions defined in the process-local namespace.
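As a rough illustration of what "function lookup in the process-local environment" means, the sketch below resolves a coroutine function by name from a namespace dict and runs it. The helper `run_from_env` and the `env` dict are hypothetical; how the library actually detects and resolves the environment is not shown here.

```python
import asyncio
import inspect
from typing import Any, Dict, List


def run_from_env(env: Dict[str, Any], func_name: str, args: List[Any]) -> Any:
    """Look up an async function by name in `env` and run it to completion."""
    func = env[func_name]  # KeyError if this namespace never defined it
    if not inspect.iscoroutinefunction(func):
        raise TypeError(f"{func_name} is not an async function")
    return asyncio.run(func(*args))


# Hypothetical stand-in for the namespace filled by an earlier exec call.
env: Dict[str, Any] = {}
exec('''
import asyncio

async def process_data(items):
    results = []
    for item in items:
        await asyncio.sleep(0.001)
        results.append(item * 2)
    return results
''', env)

print(run_from_env(env, "process_data", [[1, 2, 3]]))  # [2, 4, 6]
```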
## Building Python Actors
The process-bound model enables a pattern we call "Python actors" - Erlang processes that encapsulate Python state and expose it through message passing.
### Basic Actor Pattern
```erlang
-module(py_counter).
-behaviour(gen_server).

-export([start_link/0, increment/1, decrement/1, get/1]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

increment(Pid) -> gen_server:call(Pid, increment).
decrement(Pid) -> gen_server:call(Pid, decrement).
get(Pid) -> gen_server:call(Pid, get).

init([]) ->
    Ctx = py:context(),
    %% Define the counter in this process's Python environment
    ok = py:exec(Ctx, <<"
class Counter:
    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1
        return self.value

    def decrement(self):
        self.value -= 1
        return self.value

    def get(self):
        return self.value

counter = Counter()
">>),
    {ok, #{ctx => Ctx}}.

handle_call(increment, _From, #{ctx := Ctx} = State) ->
    {ok, Value} = py:eval(Ctx, <<"counter.increment()">>),
    {reply, Value, State};
handle_call(decrement, _From, #{ctx := Ctx} = State) ->
    {ok, Value} = py:eval(Ctx, <<"counter.decrement()">>),
    {reply, Value, State};
handle_call(get, _From, #{ctx := Ctx} = State) ->
    {ok, Value} = py:eval(Ctx, <<"counter.get()">>),
    {reply, Value, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```
Usage:
```erlang
{ok, Counter} = py_counter:start_link(),
1 = py_counter:increment(Counter),
2 = py_counter:increment(Counter),
1 = py_counter:decrement(Counter),
1 = py_counter:get(Counter).
```
### Reset via Process Termination
Following Erlang's "let it crash" philosophy, resetting Python state is simple:
```erlang
%% Supervise the Python actor
init([]) ->
    Children = [
        #{id => py_worker,
          start => {py_worker, start_link, []},
          restart => permanent}
    ],
    {ok, {#{strategy => one_for_one}, Children}}.

%% To reset: terminate the child, then restart it through the supervisor
reset_worker(Sup) ->
    ok = supervisor:terminate_child(Sup, py_worker),
    {ok, _} = supervisor:restart_child(Sup, py_worker).
```
No need to manually clear variables, reload modules, or reset interpreter state. The new process starts with a fresh Python environment.
### Stateful ML Pipeline
```erlang
-module(ml_predictor).
-behaviour(gen_server).

-export([start_link/1, predict/2]).
-export([init/1, handle_call/3, handle_cast/2, terminate/2]).

start_link(ModelPath) ->
    gen_server:start_link(?MODULE, ModelPath, []).

predict(Pid, Features) ->
    gen_server:call(Pid, {predict, Features}).

init(ModelPath) ->
    Ctx = py:context(),
    %% Define functions and load model - stored in process-bound environment
    ok = py:exec(Ctx, <<"
import pickle

_model = None

def load_model(path):
    global _model
    with open(path, 'rb') as f:
        _model = pickle.load(f)
    return True

def predict(features):
    return _model.predict([features]).tolist()[0]
">>),
    %% Load model - it's stored in _model global within this process's env
    {ok, true} = py:call(Ctx, '__main__', load_model, [ModelPath]),
    {ok, #{ctx => Ctx}}.

handle_call({predict, Features}, _From, #{ctx := Ctx} = State) ->
    {ok, Result} = py:call(Ctx, '__main__', predict, [Features]),
    {reply, {ok, Result}, State}.

%% Required gen_server callback; this server has no casts to handle
handle_cast(_Msg, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    %% Python environment automatically cleaned up
    ok.
```
## Advantages
| Aspect | Benefit |
|--------|---------|
| **Isolation** | Processes cannot interfere with each other's Python state |
| **Cleanup** | No resource leaks - process death = environment cleanup |
| **Restart** | Fresh state by terminating process (no manual reset logic) |
| **Supervision** | OTP supervisors manage Python actors like any other process |
| **Debugging** | Process dictionary inspection shows environment reference |
| **Memory** | Each process's Python memory counted separately |
## Trade-offs
| Aspect | Consideration |
|--------|---------------|
| **Memory overhead** | Each process has separate Python dict; no sharing |
| **Startup cost** | Environment created on first call per process |
| **No shared state** | State sharing requires explicit message passing or ETS |
| **Module caching** | Imported modules cached at interpreter level, not per-process |
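
The module-caching row is worth a concrete look, because it is the one place where namespace isolation does not apply. The sketch below uses two plain dicts (`env_a` and `env_b`, both hypothetical stand-ins for process environments) inside one interpreter; it relies only on standard Python import behaviour.

```python
import sys

# Hypothetical stand-ins for two process-bound environments
# living in the same interpreter.
env_a: dict = {}
env_b: dict = {}

exec("import json\nx = 'A'", env_a)
exec("import json\nx = 'B'", env_b)

# Plain variables are isolated per namespace.
print(env_a["x"], env_b["x"])                # A B

# The imported module is one shared object, cached in sys.modules.
print(env_a["json"] is env_b["json"])        # True
print(env_a["json"] is sys.modules["json"])  # True

# State stored on a module is therefore visible from every namespace.
exec("json.marker = 'set by A'", env_a)
print(eval("json.marker", env_b))            # set by A

# Remove the demo attribute again.
del sys.modules["json"].marker
```

Module-level state (caches, registries, monkey patches) should therefore be treated as shared between all processes that use the same interpreter.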
## When to Use
**Good fit:**
- Stateful services (sessions, connections, workflows)
- Actor-style Python components
- Isolated workers that may need reset
- Per-request processing with state accumulation
- Supervised Python services
**Consider alternatives when:**
- Sharing state between many processes (use ETS or message passing)
- State must survive process restarts (use external storage)
- Memory is constrained (many processes = many environments)
- Truly stateless operations (environment overhead unnecessary)
## Comparison with Other Models
### vs. Global Interpreter State
Traditional Python embedding shares state globally. Any code can modify any variable. Isolation requires explicit namespace management.
With process-bound environments:
```erlang
%% Each process is automatically isolated
spawn(fun() -> py:exec(Ctx, <<"x = 1">>) end),
spawn(fun() -> py:exec(Ctx, <<"x = 2">>) end).
%% No conflict - different environments
```
### vs. Multiple Interpreters
Some systems create separate Python interpreters per "session". This provides isolation but:
- High memory cost per interpreter
- GIL contention in multi-interpreter setups
- Complex lifecycle management
Process-bound environments use a single interpreter (or subinterpreter pool) but isolate at the namespace level - lightweight and efficient.
### vs. Stateless Lambda-Style
Some systems treat Python as pure functions with no state between calls:
```erlang
%% Stateless style - no persistence
py:call(math, sqrt, [16]).
```
Process-bound environments allow both stateless and stateful patterns in the same system.
## Technical Details
Environments are stored as NIF resources with the following lifecycle:
1. **Creation**: First `py:exec/eval/call` in a process allocates an environment
2. **Storage**: Reference kept in process dictionary under `py_local_env`
3. **Usage**: Each call uses the environment as local namespace
4. **Cleanup**: NIF resource destructor runs when process terminates
For subinterpreters, environments are created inside the target interpreter using its memory allocator - critical for memory safety.
### Interpreter ID Validation
Each `py_env_resource_t` stores the Python interpreter ID (`interp_id`) when created. For OWN_GIL contexts, before any operation using a process-local env, the system validates that the env belongs to the current interpreter:
```c
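PyInterpreterState *current_interp = PyInterpreterState_Get();
if (penv->interp_id != PyInterpreterState_GetID(current_interp)) {
    /* Returned to Erlang as {error, env_wrong_interpreter}; `env` is the call's ErlNifEnv* */
    return enif_make_tuple2(env, enif_make_atom(env, "error"),
                            enif_make_atom(env, "env_wrong_interpreter"));
}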
```
This prevents:
- Using an env from a destroyed interpreter (dangling pointer)
- Using an env created for a different OWN_GIL context
- Memory corruption from cross-interpreter dict access
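
The guard can also be pictured at a higher level. The following is a simplified Python model of the check, not the NIF code: the `Env` dataclass, `eval_in_env`, and the integer interpreter IDs are hypothetical, and the result tuples only imitate the Erlang return shapes.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass
class Env:
    interp_id: int  # recorded once, when the environment is created
    namespace: Dict[str, Any] = field(default_factory=dict)


def eval_in_env(current_interp_id: int, env: Env, expr: str) -> Tuple[str, Any]:
    # Refuse to touch a namespace that belongs to another interpreter.
    if env.interp_id != current_interp_id:
        return ("error", "env_wrong_interpreter")
    return ("ok", eval(expr, env.namespace))


env = Env(interp_id=1)
env.namespace["x"] = 41

print(eval_in_env(1, env, "x + 1"))  # ('ok', 42)
print(eval_in_env(2, env, "x + 1"))  # ('error', 'env_wrong_interpreter')
```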
### Cleanup Safety
For the main interpreter (`interp_id == 0`), the destructor acquires the GIL and decrefs the Python dicts normally.
For subinterpreters, the destructor skips `Py_DECREF` because:
1. `PyGILState_Ensure` cannot safely acquire a subinterpreter's GIL
2. The Python objects will be freed when the subinterpreter is destroyed via `Py_EndInterpreter`
This design favors safety: it accepts that these dicts stay allocated until the subinterpreter is destroyed, rather than risking an unsafe decref from the wrong thread state.
## See Also
- [OWN_GIL Internals](owngil_internals.md) - Architecture and safety mechanisms for OWN_GIL mode
- [Scalability](scalability.md) - Mode comparison (owngil vs subinterp vs worker)
- [Event Loop Architecture](event_loop_architecture.md) - Per-process namespace management
- [Context Affinity](context-affinity.md) - Context binding and routing
- [Scheduling](asyncio.md) - Cooperative scheduling for long operations