guides/dirty-schedulers.md

# Dirty Schedulers

This guide covers dirty schedulers and long-running NIF operations in C3nif.

## The 1ms Rule

Regular NIFs must complete quickly (typically under 1 millisecond) to avoid blocking an Erlang scheduler thread. The BEAM multiplexes many Erlang processes onto a fixed number of scheduler threads, and a long-running NIF monopolizes one of them.

**Problem**: A NIF that takes 100ms occupies a scheduler thread for the equivalent of ~100 timeslices, starving every process queued on that scheduler.

**Solutions**:
1. **Dirty schedulers** - Run on separate thread pools
2. **Yielding NIFs** - Split work into chunks
3. **Async threads** - Run work in a separate thread and send result

## Dirty Scheduler Types

The BEAM provides two dirty scheduler pools:

| Type | Use Case | Default Pool Size |
|------|----------|-------------------|
| CPU-bound | Compute-intensive work (crypto, compression, ML) | Number of logical processors (`+SDcpu`) |
| I/O-bound | Blocking I/O (file ops, network, syscalls) | 10 (`+SDio`) |

## Static Dirty NIF Declaration

The simplest approach: declare the NIF as dirty at compile time:

```c3
<* nif: arity = 1, dirty = cpu *>
fn ErlNifTerm heavy_compute(
    ErlNifEnv* raw_env,
    CInt argc,
    ErlNifTerm* argv
) {
    Env e = env::wrap(raw_env);
    // This always runs on a dirty CPU scheduler
    // ...expensive computation...
    return term::make_int(&e, result).raw();
}

<* nif: arity = 1, dirty = io *>
fn ErlNifTerm blocking_io(
    ErlNifEnv* raw_env,
    CInt argc,
    ErlNifTerm* argv
) {
    Env e = env::wrap(raw_env);
    // This always runs on a dirty I/O scheduler
    // ...blocking I/O operation...
    return term::make_int(&e, result).raw();
}
```

### Annotation Options

| Annotation | Scheduler Type |
|------------|----------------|
| `dirty = cpu` | Dirty CPU-bound scheduler |
| `dirty = io` | Dirty I/O-bound scheduler |
| (none) | Normal scheduler |

## Dynamic Scheduling

Sometimes you want to decide at runtime whether to use a dirty scheduler:

```c3
import c3nif::scheduler;

<* nif: arity = 1 *>
fn ErlNifTerm process_data(
    ErlNifEnv* raw_env,
    CInt argc,
    ErlNifTerm* argv
) {
    Env e = env::wrap(raw_env);
    Term arg = term::wrap(argv[0]);

    // Check data size
    ErlNifBinary? bin = arg.inspect_binary(&e);
    if (catch err = bin) {
        return term::make_badarg(&e).raw();
    }

    if (bin.size > 1024 * 1024) {
        // Large data - reschedule onto a dirty CPU scheduler.
        // The target (process_data_impl) must have the standard
        // NIF signature; it receives the same argv.
        return scheduler::schedule_dirty_cpu(
            &e,
            "process_data_impl",
            &process_data_impl,
            argc,
            argv
        ).raw();
    }

    // Small data - process inline on the normal scheduler
    return do_process(&e, &bin).raw();
}
```

### schedule_nif Variants

```c3
// Schedule on dirty CPU scheduler
scheduler::schedule_dirty_cpu(&e, "name", &func, argc, argv)

// Schedule on dirty I/O scheduler
scheduler::schedule_dirty_io(&e, "name", &func, argc, argv)

// Schedule on normal scheduler (switch back from dirty)
scheduler::schedule_normal(&e, "name", &func, argc, argv)

// Generic with flags
scheduler::schedule_nif(&e, "name", flags, &func, argc, argv)
// where flags is: SCHED_NORMAL, SCHED_CPU_BOUND, or SCHED_IO_BOUND
```
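A common pattern with these variants is a round trip: hop onto a dirty scheduler for the heavy phase, then hop back to a normal scheduler to build the reply. A minimal sketch, assuming the `scheduler` API shown above and a hypothetical `run_heavy_phase` helper:

```c3
import c3nif::scheduler;

<* nif: arity = 1 *>
fn ErlNifTerm transform(
    ErlNifEnv* raw_env,
    CInt argc,
    ErlNifTerm* argv
) {
    Env e = env::wrap(raw_env);
    // Hop onto a dirty CPU scheduler for the heavy phase
    return scheduler::schedule_dirty_cpu(
        &e, "transform_heavy", &transform_heavy, argc, argv
    ).raw();
}

fn ErlNifTerm transform_heavy(
    ErlNifEnv* raw_env,
    CInt argc,
    ErlNifTerm* argv
) {
    Env e = env::wrap(raw_env);
    run_heavy_phase(argv[0]);  // hypothetical heavy work
    // Hop back to a normal scheduler to build the reply
    return scheduler::schedule_normal(
        &e, "transform_finish", &transform_finish, argc, argv
    ).raw();
}

fn ErlNifTerm transform_finish(
    ErlNifEnv* raw_env,
    CInt argc,
    ErlNifTerm* argv
) {
    Env e = env::wrap(raw_env);
    return term::make_atom(&e, "ok").raw();
}
```

Hopping back via `schedule_normal` keeps term-heavy reply construction off the dirty pool, which is sized for long-running work rather than throughput.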

## Thread Type Detection

Check which scheduler type you're running on:

```c3
import c3nif::scheduler;

<* nif: arity = 0 *>
fn ErlNifTerm get_scheduler_type(
    ErlNifEnv* raw_env,
    CInt argc,
    ErlNifTerm* argv
) {
    Env e = env::wrap(raw_env);

    ThreadType t = scheduler::current_thread_type();

    char* name;
    switch (t) {
        case ThreadType.NORMAL:
            name = "normal";
        case ThreadType.DIRTY_CPU:
            name = "dirty_cpu";
        case ThreadType.DIRTY_IO:
            name = "dirty_io";
        default:
            name = "undefined";
    }

    return term::make_atom(&e, name).raw();
}
```

### Helper Functions

```c3
// Check if on dirty scheduler
if (scheduler::is_dirty_scheduler()) {
    // Running on dirty CPU or I/O scheduler
}

// Check if on normal scheduler
if (scheduler::is_normal_scheduler()) {
    // Running on normal scheduler
}
```
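These helpers are useful for guarding code paths that are only safe on one scheduler type. A small sketch, assuming the `scheduler` helpers above, for a shared worker that must never run on a normal scheduler because it may block:

```c3
fn void ensure_dirty_context()
{
    // Defensive check: this worker may block for a long time,
    // so calling it from a normal scheduler is a programming error.
    if (scheduler::is_normal_scheduler()) {
        unreachable("blocking worker called from a normal scheduler");
    }
}
```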

## Process Liveness

On dirty schedulers, the calling process can terminate while the NIF runs:

```c3
<* nif: arity = 1, dirty = cpu *>
fn ErlNifTerm long_computation(
    ErlNifEnv* raw_env,
    CInt argc,
    ErlNifTerm* argv
) {
    Env e = env::wrap(raw_env);

    for (int i = 0; i < 1000000; i++) {
        // Periodically check if process is still alive
        if (i % 10000 == 0) {
            if (!scheduler::is_process_alive(&e)) {
                // Process terminated - abort early
                return term::make_atom(&e, "process_terminated").raw();
            }
        }

        // ... do work ...
    }

    return term::make_int(&e, result).raw();
}
```

### What Happens When a Process Dies

When the calling process terminates during a dirty NIF:
1. Links and monitors are triggered
2. The registered name is released
3. ETS tables owned by the process are cleaned up
4. **The NIF continues to execute** - and its return value is ignored

Always check `is_process_alive()` in long-running dirty NIFs to avoid wasting work on a result nobody will receive.

## Timeslice Consumption

For normal schedulers, consume timeslices to cooperate with the scheduler:

```c3
fn ErlNifTerm cooperative_nif(
    ErlNifEnv* raw_env,
    CInt argc,
    ErlNifTerm* argv
) {
    Env e = env::wrap(raw_env);

    for (int i = 0; i < iterations; i++) {
        // Do a chunk of work
        process_chunk(i);

        // Report progress (1% per chunk)
        if (e.consume_timeslice(1)) {
            // Consumed too much time - should yield
            // For yielding NIFs, schedule continuation here
            break;
        }
    }

    return term::make_int(&e, result).raw();
}
```

The argument to `consume_timeslice` is the percentage (1-100) of a full timeslice that the work since the previous call consumed. It returns `true` once the NIF has used up its timeslice and should yield. For example, reporting 1% per chunk means the call starts returning `true` after roughly 100 chunks.

## Yielding NIFs

For operations that can be split into chunks, yielding NIFs are preferred over dirty schedulers:

```c3
// Context stored in a resource (survives across yields)
struct ComputeContext {
    int current_index;
    int total;
    int result;
}

<* nif: arity = 1 *>
fn ErlNifTerm start_compute(
    ErlNifEnv* raw_env,
    CInt argc,
    ErlNifTerm* argv
) {
    Env e = env::wrap(raw_env);

    int? total = term::wrap(argv[0]).get_int(&e);
    if (catch err = total) {
        return term::make_badarg(&e).raw();
    }

    // Allocate context resource
    void*? ptr = resource::alloc(compute_context_type, ComputeContext.sizeof);
    if (catch err = ptr) {
        return term::make_error_atom(&e, "alloc_failed").raw();
    }

    ComputeContext* ctx = (ComputeContext*)ptr;
    ctx.current_index = 0;
    ctx.total = total;
    ctx.result = 0;

    // Create resource term
    Term ctx_term = resource::make_term(&e, ptr);
    resource::release(ptr);

    // Schedule the continuation with context as argument
    ErlNifTerm[1] new_argv = { ctx_term.raw() };
    return scheduler::schedule_normal(
        &e,
        "compute_chunk",
        &compute_chunk,
        1,
        &new_argv[0]
    ).raw();
}

fn ErlNifTerm compute_chunk(
    ErlNifEnv* raw_env,
    CInt argc,
    ErlNifTerm* argv
) {
    Env e = env::wrap(raw_env);

    void*? ptr = resource::get(compute_context_type, &e, term::wrap(argv[0]));
    if (catch err = ptr) {
        return term::make_badarg(&e).raw();
    }

    ComputeContext* ctx = (ComputeContext*)ptr;

    // Process a chunk
    int chunk_size = 1000;
    int end = ctx.current_index + chunk_size;
    if (end > ctx.total) {
        end = ctx.total;
    }

    for (int i = ctx.current_index; i < end; i++) {
        ctx.result += expensive_operation(i);

        // Optionally check timeslice
        if (i % 100 == 0 && e.consume_timeslice(1)) {
            ctx.current_index = i + 1;
            // Yield and continue later
            return scheduler::schedule_normal(
                &e,
                "compute_chunk",
                &compute_chunk,
                argc,
                argv
            ).raw();
        }
    }

    ctx.current_index = end;

    if (ctx.current_index < ctx.total) {
        // More work to do - yield
        return scheduler::schedule_normal(
            &e,
            "compute_chunk",
            &compute_chunk,
            argc,
            argv
        ).raw();
    }

    // Done - return result
    return term::make_int(&e, ctx.result).raw();
}
```

## Choosing the Right Approach

| Scenario | Recommendation |
|----------|----------------|
| < 1ms work | Normal NIF |
| Can split into chunks | Yielding NIF |
| CPU-bound, can't split | Dirty CPU scheduler |
| Blocking I/O | Dirty I/O scheduler |
| Needs to track partial progress | Yielding NIF with resource |

## Dirty Scheduler Limitations

Operations that work on dirty schedulers:
- All term creation/extraction functions
- Resource allocation and access
- Memory allocation (`allocator::*`)
- Message sending (`env::send`)
- Process monitoring

Things to be careful about:
- Process may terminate mid-execution
- Garbage collection of the calling process is delayed until the NIF returns
- Can't call ETS functions that would block
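Since message sending works from dirty schedulers, a long-running dirty NIF can stream progress back to the caller as it works. A sketch, assuming hypothetical `env::self` and `env::send` wrappers (their exact signatures are assumptions; only message sending itself is implied by the list above):

```c3
<* nif: arity = 0, dirty = cpu *>
fn ErlNifTerm compute_with_progress(
    ErlNifEnv* raw_env,
    CInt argc,
    ErlNifTerm* argv
) {
    Env e = env::wrap(raw_env);
    ErlNifPid caller;
    env::self(&e, &caller);  // hypothetical: pid of the calling process

    for (int step = 0; step < 100; step++) {
        do_one_percent_of_work(step);  // hypothetical worker

        // Abort early if the caller died; otherwise stream progress
        if (!scheduler::is_process_alive(&e)) {
            return term::make_atom(&e, "aborted").raw();
        }
        env::send(&e, &caller, term::make_int(&e, step));  // hypothetical signature
    }
    return term::make_atom(&e, "done").raw();
}
```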

## Best Practices

1. **Prefer yielding NIFs** when work can be split - they're more cooperative

2. **Check process liveness** in long-running dirty NIFs

3. **Use CPU-bound for compute** (crypto, compression, math)

4. **Use I/O-bound for blocking** (file I/O, network, external processes)

5. **Don't mix scheduler types** in the same logical operation

6. **Profile before optimizing** - measure actual execution time

7. **Store continuation state in resources** - stack is invalid across yields

## Complete Example: Parallel Hash

```c3
module hash_nif;

import c3nif;
import c3nif::erl_nif;
import c3nif::env;
import c3nif::term;
import c3nif::scheduler;
import c3nif::binary;

// Hash a large binary - uses dirty CPU scheduler
<* nif: arity = 1, dirty = cpu *>
fn ErlNifTerm hash_binary(
    ErlNifEnv* raw_env,
    CInt argc,
    ErlNifTerm* argv
) {
    Env e = env::wrap(raw_env);

    Binary? bin = binary::inspect(&e, term::wrap(argv[0]));
    if (catch err = bin) {
        return term::make_badarg(&e).raw();
    }

    // Check process liveness for large binaries
    if (bin.size > 10 * 1024 * 1024) {
        // > 10MB - check periodically
        ulong hash = 0;
        char[] data = bin.as_slice();

        for (usz i = 0; i < bin.size; i++) {
            hash = hash * 31 + (ulong)data[i];

            if (i != 0 && i % (1024 * 1024) == 0) {  // Every 1MB
                if (!scheduler::is_process_alive(&e)) {
                    return term::make_atom(&e, "aborted").raw();
                }
            }
        }

        return term::make_ulong(&e, hash).raw();
    }

    // Small binary - just hash it
    ulong hash = compute_hash(bin.as_slice());
    return term::make_ulong(&e, hash).raw();
}

fn ulong compute_hash(char[] data) {
    ulong hash = 0;
    for (usz i = 0; i < data.len; i++) {
        hash = hash * 31 + (ulong)data[i];
    }
    return hash;
}
```