# Type Conversion

This guide details how values are converted between Erlang and Python.

## Erlang to Python

When calling Python functions or evaluating expressions, Erlang values are automatically converted:

| Erlang | Python | Notes |
|--------|--------|-------|
| `integer()` | `int` | Arbitrary precision supported |
| `float()` | `float` | IEEE 754 double precision |
| `binary()` | `str` | Decoded as UTF-8 |
| `{bytes, binary()}` | `bytes` | Explicit bytes (no UTF-8 decode) |
| `atom()` | `str` | Converted to string, except the special atoms `true`, `false`, `none`, `nil`, and `undefined` listed below |
| `true` | `True` | Boolean |
| `false` | `False` | Boolean |
| `none` | `None` | Null value |
| `nil` | `None` | Null value (Elixir compatibility) |
| `undefined` | `None` | Null value |
| `list()` | `list` | Recursively converted |
| `tuple()` | `tuple` | Recursively converted |
| `map()` | `dict` | Keys and values recursively converted |
| `pid()` | `erlang.Pid` | Opaque wrapper, round-trips back to Erlang PID |

### Examples

```erlang
%% Integers
py:call(mymod, func, [42]).           %% Python receives: 42
py:call(mymod, func, [123456789012345678901234567890]).  %% Big integers work

%% Floats
py:call(mymod, func, [3.14159]).      %% Python receives: 3.14159

%% Strings (binaries)
py:call(mymod, func, [<<"hello">>]).  %% Python receives: "hello"

%% Atoms become strings
py:call(mymod, func, [foo]).          %% Python receives: "foo"

%% Booleans
py:call(mymod, func, [true, false]).  %% Python receives: True, False

%% None equivalents
py:call(mymod, func, [none]).         %% Python receives: None
py:call(mymod, func, [nil]).          %% Python receives: None
py:call(mymod, func, [undefined]).    %% Python receives: None

%% Lists
py:call(mymod, func, [[1, 2, 3]]).    %% Python receives: [1, 2, 3]

%% Tuples
py:call(mymod, func, [{1, 2, 3}]).    %% Python receives: (1, 2, 3)

%% Maps become dicts
py:call(mymod, func, [#{a => 1, b => 2}]).  %% Python receives: {"a": 1, "b": 2}
```
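
To check what a given Erlang term looks like after conversion, a small Python-side helper can report the types it received. The sketch below assumes a hypothetical `mymod.py` whose `func` is the target of the calls above; both names are placeholders taken from the examples, not part of the library.

```python
# mymod.py (hypothetical module used as the call target in the examples above)

def func(*args):
    """Return the Python type name of each argument received."""
    return [type(arg).__name__ for arg in args]


if __name__ == "__main__":
    # Simulate the values Python would see for a few of the calls above.
    print(func(42))                # ['int']
    print(func("hello"))           # ['str']
    print(func(True, False))       # ['bool', 'bool']
    print(func(None))              # ['NoneType']
    print(func({"a": 1, "b": 2}))  # ['dict']
```

Going by the table, calling this with an atom such as `foo` should report `str`, and with `{bytes, <<"hello">>}` it should report `bytes`.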

### Explicit Bytes Conversion

By default, Erlang binaries are converted to Python `str` using UTF-8 decoding.
To explicitly send raw bytes without string conversion, use the `{bytes, Binary}` tuple:

```erlang
%% Default: binary -> str
py:call(mymod, func, [<<"hello">>]).  %% Python sees: "hello" (str)

%% Explicit: {bytes, binary} -> bytes
py:call(mymod, func, [{bytes, <<"hello">>}]).  %% Python sees: b"hello" (bytes)

%% Useful for binary protocols, images, compressed data
py:call(image_processor, load, [{bytes, ImageData}]).
```

This is useful when you need to ensure binary data is treated as raw bytes in Python,
for example when working with binary protocols, image data, or compressed content.
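
On the Python side, a receiver can guard against a caller forgetting the `{bytes, Binary}` wrapper. The following is a minimal sketch of what the `image_processor.load` function from the example might look like; its body, the returned dict, and the PNG check are assumptions made for illustration.

```python
# image_processor.py (hypothetical; only the call shape comes from the example above)

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def load(data):
    """Inspect raw image bytes sent from Erlang as {bytes, Binary}."""
    if not isinstance(data, bytes):
        # A plain Erlang binary would arrive as str, so fail early and clearly.
        raise TypeError(f"expected bytes, got {type(data).__name__}")
    return {"size": len(data), "is_png": data.startswith(PNG_SIGNATURE)}
```

Failing early with a `TypeError` keeps an accidental `str` argument from surfacing later as a harder-to-trace decoding problem.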

Note that on the return path, both Python `str` and `bytes` become Erlang `binary()`:

```erlang
%% Python str -> Erlang binary
{ok, <<"hello">>} = py:eval(<<"'hello'">>).

%% Python bytes -> Erlang binary
{ok, <<"hello">>} = py:eval(<<"b'hello'">>).

%% Non-UTF8 bytes also work
{ok, <<255, 254>>} = py:eval(<<"b'\\xff\\xfe'">>).
```

## Python to Erlang

Return values from Python are converted back to Erlang:

| Python | Erlang | Notes |
|--------|--------|-------|
| `int` | `integer()` or `string()` | Integers wider than 64 bits are returned as strings (character lists) |
| `float` | `float()` | IEEE 754 double precision |
| `float('nan')` | `nan` | Atom for Not-a-Number |
| `float('inf')` | `infinity` | Atom for positive infinity |
| `float('-inf')` | `neg_infinity` | Atom for negative infinity |
| `str` | `binary()` | UTF-8 encoded |
| `bytes` | `binary()` | Raw bytes |
| `True` | `true` | Boolean |
| `False` | `false` | Boolean |
| `None` | `none` | Null value |
| `list` | `list()` | Recursively converted |
| `tuple` | `tuple()` | Recursively converted |
| `dict` | `map()` | Keys and values recursively converted |
| `erlang.Pid` | `pid()` | Round-trips back to the original Erlang PID |
| generator | internal | Used with streaming functions |

### Examples

```erlang
%% Integers
{ok, 42} = py:eval(<<"42">>).

%% Big integers (> 64 bits) are returned as strings
{ok, "123456789012345678901234567890"} = py:eval(<<"123456789012345678901234567890">>).

%% Floats
{ok, 3.14} = py:eval(<<"3.14">>).

%% Special floats
{ok, nan} = py:eval(<<"float('nan')">>).
{ok, infinity} = py:eval(<<"float('inf')">>).
{ok, neg_infinity} = py:eval(<<"float('-inf')">>).

%% Strings
{ok, <<"hello">>} = py:eval(<<"'hello'">>).

%% Bytes
{ok, <<72,101,108,108,111>>} = py:eval(<<"b'Hello'">>).

%% Booleans
{ok, true} = py:eval(<<"True">>).
{ok, false} = py:eval(<<"False">>).

%% None
{ok, none} = py:eval(<<"None">>).

%% Lists
{ok, [1, 2, 3]} = py:eval(<<"[1, 2, 3]">>).

%% Tuples
{ok, {1, 2, 3}} = py:eval(<<"(1, 2, 3)">>).

%% Dicts become maps
{ok, #{<<"a">> := 1, <<"b">> := 2}} = py:eval(<<"{'a': 1, 'b': 2}">>).
```

### Process Identifiers (PIDs)

Erlang PIDs are converted to opaque `erlang.Pid` objects in Python. These can be
passed back to Erlang (where they become real PIDs again) or used with `erlang.send()`:

```erlang
%% Pass self() to Python - arrives as erlang.Pid
{ok, Pid} = py:call(mymod, round_trip_pid, [self()]).
%% Pid =:= self()

%% Python can send messages directly to Erlang processes
ok = py:exec(<<"
import erlang
def notify(pid, data):
    erlang.send(pid, ('notification', data))
">>).
```

```python
import erlang

def forward_to(pid, message):
    """Send a message to an Erlang process."""
    erlang.send(pid, message)
```

`erlang.Pid` objects support equality and hashing, so they can be compared and
used as dict keys or in sets:

```python
pid_a == pid_b       # True if both wrap the same Erlang PID
{pid: "value"}       # Works as a dict key
pid in seen_pids     # Works in sets
```

Sending to a process that has already exited raises `erlang.ProcessError`.
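
Hashable PIDs and the error raised for exited processes combine naturally into a subscriber registry. The sketch below is one possible shape, not a library feature: the `Subscribers` class is hypothetical, and it takes the send function and the error type as parameters so that it can also be exercised outside the embedded interpreter.

```python
class Subscribers:
    """Track subscriber PIDs and drop the ones that can no longer be reached."""

    def __init__(self, send, dead_error=Exception):
        # In real use, `send` would be erlang.send and `dead_error`
        # would be erlang.ProcessError (both named in this guide).
        self._send = send
        self._dead_error = dead_error
        self._pids = set()  # relies on PIDs being hashable

    def add(self, pid):
        self._pids.add(pid)

    def broadcast(self, message):
        """Send message to every subscriber; return how many sends succeeded."""
        delivered = 0
        for pid in list(self._pids):  # copy: the set may shrink while looping
            try:
                self._send(pid, message)
                delivered += 1
            except self._dead_error:
                self._pids.discard(pid)  # process has exited; forget it
        return delivered

    def __len__(self):
        return len(self._pids)
```

Inside the interpreter this would be constructed as `Subscribers(erlang.send, erlang.ProcessError)`. Adding the same PID twice has no effect because the registry is a set.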

## Special Cases

### NumPy Arrays

NumPy arrays are converted to nested Erlang lists:

```erlang
%% 1D array (float input, so the result is a list of floats)
{ok, [1.0, 2.0, 3.0]} = py:eval(<<"import numpy as np; np.array([1.0, 2.0, 3.0]).tolist()">>).

%% 2D array
{ok, [[1, 2], [3, 4]]} = py:eval(<<"import numpy as np; np.array([[1,2],[3,4]]).tolist()">>).
```

For best performance with large arrays, call `.tolist()` in Python before returning rather than relying on direct array conversion.
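
When a function may return either an array or an ordinary value, the `.tolist()` call can be applied defensively. This is a minimal sketch using a hypothetical helper name, `as_plain`; it relies only on the object exposing a `tolist()` method, as NumPy arrays do.

```python
def as_plain(value):
    """Return value.tolist() when the method exists, otherwise value unchanged."""
    to_list = getattr(value, "tolist", None)
    return to_list() if callable(to_list) else value
```

A function called from Erlang could then end with `return as_plain(result)` regardless of whether `result` is an array or already a list.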

### Nested Structures

Nested data structures are recursively converted:

```erlang
%% Nested dict
{ok, #{<<"user">> := #{<<"name">> := <<"Alice">>, <<"age">> := 30}}} =
    py:eval(<<"{'user': {'name': 'Alice', 'age': 30}}">>).

%% List of tuples
{ok, [{1, <<"a">>}, {2, <<"b">>}]} = py:eval(<<"[(1, 'a'), (2, 'b')]">>).

%% Mixed nesting
{ok, #{<<"items">> := [1, 2, 3], <<"meta">> := {<<"ok">>, 200}}} =
    py:eval(<<"{'items': [1, 2, 3], 'meta': ('ok', 200)}">>).
```

### Map Keys

Erlang maps accept any term as a key, whereas Python dict keys must be hashable. Atom and binary keys both arrive as Python strings:

```erlang
%% Erlang atom keys become Python strings
py:call(json, dumps, [#{foo => 1, bar => 2}]).
%% Python sees: {"foo": 1, "bar": 2}

%% Binary keys stay as strings
py:call(json, dumps, [#{<<"foo">> => 1}]).
%% Python sees: {"foo": 1}
```

When Python returns dicts, string keys become binaries:

```erlang
{ok, #{<<"foo">> := 1}} = py:eval(<<"{'foo': 1}">>).
```

### Keyword Arguments

Pass a map as the fourth argument to `py:call` to supply Python keyword arguments:

```erlang
%% Call with kwargs
{ok, Json} = py:call(json, dumps, [Data], #{indent => 2, sort_keys => true}).

%% Equivalent Python: json.dumps(data, indent=2, sort_keys=True)
```
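
The mapping from the Erlang call to the Python call can be pictured as ordinary argument unpacking. The sketch below is an illustration only: `apply_call` is a hypothetical stand-in, and it assumes the atom keys of the kwargs map arrive as parameter names.

```python
import json


def apply_call(func, args, kwargs):
    """Apply a list of positional arguments and a dict of keyword arguments."""
    return func(*args, **kwargs)


data = {"b": 1, "a": [1, 2]}

# Mirrors: py:call(json, dumps, [Data], #{indent => 2, sort_keys => true})
result = apply_call(json.dumps, [data], {"indent": 2, "sort_keys": True})

# Same output as writing the keyword arguments directly in Python.
assert result == json.dumps(data, indent=2, sort_keys=True)
```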

## Unsupported Types

Some Python types cannot be directly converted:

| Python Type | Workaround |
|-------------|------------|
| `set` | Convert to list: `list(my_set)` |
| `frozenset` | Convert to tuple: `tuple(my_frozenset)` |
| `datetime` | Use `.isoformat()` or timestamp |
| `Decimal` | Use `str()` to preserve precision, or `float()` |
| Custom objects | Implement `__iter__` or serialization |

### Example Workarounds

```erlang
%% Sets - convert to a sorted list in Python (set order is not guaranteed)
{ok, [1, 2, 3]} = py:eval(<<"sorted({3, 1, 2})">>).

%% Datetime - use ISO format
{ok, <<"2024-01-15T10:30:00">>} =
    py:eval(<<"from datetime import datetime; datetime(2024,1,15,10,30).isoformat()">>).

%% Decimal - convert to string for precision
{ok, <<"3.14159265358979323846">>} =
    py:eval(<<"from decimal import Decimal; str(Decimal('3.14159265358979323846'))">>).
```
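
When unsupported types are buried inside nested return values, the workarounds from the table can be applied in one recursive pass before returning to Erlang. The helper below is a sketch with a hypothetical name, `to_convertible`; the choice to sort set members by `repr` is an assumption made only to get deterministic output.

```python
from datetime import date, datetime
from decimal import Decimal


def to_convertible(value):
    """Recursively replace unsupported types with supported equivalents."""
    if isinstance(value, set):
        # Sets are unordered; sort by repr so the result is deterministic.
        return [to_convertible(v) for v in sorted(value, key=repr)]
    if isinstance(value, frozenset):
        return tuple(to_convertible(v) for v in sorted(value, key=repr))
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    if isinstance(value, Decimal):
        return str(value)  # str() keeps every digit; float() may round
    if isinstance(value, dict):
        return {to_convertible(k): to_convertible(v) for k, v in value.items()}
    if isinstance(value, list):
        return [to_convertible(v) for v in value]
    if isinstance(value, tuple):
        return tuple(to_convertible(v) for v in value)
    return value
```

Custom objects pass through unchanged here, so they still need their own serialization step.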

## Performance Considerations

- **Large strings**: Binary conversion is efficient, but very large strings may cause memory pressure
- **Deep nesting**: Deeply nested structures require recursive traversal
- **Big integers**: Integers larger than 64 bits are returned as strings; convert with `list_to_integer/1` if needed
- **NumPy arrays**: Call `.tolist()` for explicit conversion; direct array conversion may be slower

For large data transfers, consider:
1. Using streaming for iterables
2. Serializing to JSON/msgpack in Python
3. Processing data in chunks
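
Chunked processing can be sketched as a Python generator that yields fixed-size batches, so that no single conversion has to cover the whole data set. The `chunks` helper is hypothetical, and how the generator is consumed from Erlang depends on the library's streaming functions, which this guide does not cover.

```python
from itertools import islice


def chunks(iterable, size):
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return  # input exhausted
        yield chunk
```

Each yielded chunk is a plain list, which converts to an Erlang list according to the tables above.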