# ExEtlFramework

ExEtlFramework is an ETL (Extract, Transform, Load) framework written in Elixir. It provides a DSL for defining data processing pipelines, with built-in support for validation, error handling, and performance monitoring.

## Features

- **Modular Pipeline Structure**: Easy-to-define ETL steps using a simple DSL
- **Flexible Data Validation**: Schema-based validation with built-in and custom validators
- **Error Handling Strategies**: Choose between fail-fast or error collection approaches
- **Telemetry Integration**: Built-in performance measurement and reporting
- **Extensible**: Easy to add custom steps and validators

## Installation

Add `ex_etl_framework` to your list of dependencies in `mix.exs`:

```elixir
def deps do
  [
    {:ex_etl_framework, "~> 0.1.0"}
  ]
end
```

## Usage

### Defining a Pipeline

Create a module for your pipeline and call `use ExEtlFramework.Pipeline` inside it:

```elixir
defmodule MyPipeline do
  use ExEtlFramework.Pipeline

  step :extract do
    # Extraction logic
    {:ok, %{data: [1, 2, 3]}}
  end

  step :transform do
    # Transformation logic
    {:ok, %{data: [2, 4, 6]}}
  end

  step :load do
    # Loading logic
    {:ok, %{result: "Data loaded successfully"}}
  end

  # Optional: Define validation for each step
  def validate_extract(data) do
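    schema = %{
      # `&Mod.fun(arg)` without `&1` is not a valid capture, so type/1 is called
      # directly (assumes it returns a one-argument validator function).
      data: [&ExEtlFramework.Validator.required/1, ExEtlFramework.Validator.type(List)]
    }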
    ExEtlFramework.Validator.validate(data, schema)
  end
end
```

### Running a Pipeline

Run the pipeline, optionally passing an error handling strategy:

```elixir
result = MyPipeline.run(%{initial: "data"}, error_strategy: :collect_errors)
```

## Key Components

### Pipeline

The core module for defining and executing ETL steps. It provides:

- A DSL for defining pipeline steps
- Automatic error handling
- Integration with the validation system

### Validator

A flexible data validation system:

- Define validation schemas with built-in and custom validators
- Easy to use in pipeline steps
- Supports complex data structures

Example of a validation schema:

```elixir
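# type/1 is called directly rather than captured with `&`: a capture needs
# `&1` placeholders or the `/arity` form. This assumes type/1 returns a
# one-argument validator function.
schema = %{
  name: [&ExEtlFramework.Validator.required/1, ExEtlFramework.Validator.type(String)],
  age: [ExEtlFramework.Validator.type(Integer)],
  email: [&ExEtlFramework.Validator.required/1, &custom_email_validator/1]
}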
```
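
To make the idea of a schema concrete, here is a minimal, self-contained sketch of how schema-based validation can work: each field maps to a list of one-argument validator functions, and every failure is collected per field. `SchemaSketch` and its functions are hypothetical stand-ins written for this example; they are not `ExEtlFramework.Validator` and its return shapes may differ.

```elixir
defmodule SchemaSketch do
  # Hypothetical stand-in for illustration only, not ExEtlFramework.Validator.
  # A validator is a one-argument function returning :ok or {:error, message}.

  def required(nil), do: {:error, "is required"}
  def required(_value), do: :ok

  # Each clause returns a validator function for the requested type.
  def type(:string), do: check(&is_binary/1, "must be a string")
  def type(:integer), do: check(&is_integer/1, "must be an integer")

  defp check(predicate, message) do
    fn value -> if predicate.(value), do: :ok, else: {:error, message} end
  end

  # Runs every validator of every field and collects {field, message} pairs.
  def validate(data, schema) do
    errors =
      for {field, validators} <- schema,
          validator <- validators,
          {:error, message} <- [validator.(Map.get(data, field))] do
        {field, message}
      end

    case errors do
      [] -> {:ok, data}
      _ -> {:error, errors}
    end
  end
end

schema = %{
  name: [&SchemaSketch.required/1, SchemaSketch.type(:string)],
  age: [SchemaSketch.type(:integer)]
}

# Valid input comes back wrapped in {:ok, data}.
{:ok, _data} = SchemaSketch.validate(%{name: "Ada", age: 36}, schema)

# Invalid input yields {:error, errors} with one entry per failed validator.
{:error, _errors} = SchemaSketch.validate(%{age: "36"}, schema)
```

Note that in this sketch a missing field fails its type check as well as `required/1`; a real validator may choose to skip type checks for absent optional fields.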

### Telemetry Integration

Built-in performance monitoring using Telemetry:

- Automatically measures duration of pipeline runs and individual steps
- Tracks errors in pipelines
- Easy to integrate with your preferred monitoring solution
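
As a rough sketch of what integration could look like, the module below defines a handler with the four-argument shape that `:telemetry` expects. The module name, the event name `[:ex_etl_framework, :step, :stop]`, and the `:duration` measurement key are all assumptions made for illustration; check the library's documentation for the events it actually emits.

```elixir
defmodule MyApp.PipelineMetrics do
  # Hypothetical handler module. The event names and measurement keys it
  # expects are assumptions, not confirmed ExEtlFramework behaviour.

  # Telemetry handlers receive the event name (a list of atoms), a
  # measurements map, a metadata map, and the config given at attach time.
  def handle_event(event, measurements, metadata, _config) do
    duration = Map.get(measurements, :duration)

    line =
      "#{Enum.join(event, ".")} duration=#{inspect(duration)} metadata=#{inspect(metadata)}"

    IO.puts(line)
    line
  end
end

# Attaching needs the :telemetry dependency, so the call is shown as a comment:
#
# :telemetry.attach(
#   "log-etl-step",                      # unique handler id
#   [:ex_etl_framework, :step, :stop],   # assumed event name
#   &MyApp.PipelineMetrics.handle_event/4,
#   nil
# )
```

The handler only prints a line here; in practice it would forward the measurement to whatever metrics or logging backend you use.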

## Advanced Usage

### Custom Validators

A custom validator is a one-argument function that returns `:ok` or `{:error, message}`:

```elixir
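def custom_email_validator(value) do
  # is_binary/1 guards against non-string input (e.g. nil), which would
  # otherwise raise inside String.contains?/2.
  if is_binary(value) and String.contains?(value, "@") do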
    :ok
  else
    {:error, "Invalid email format"}
  end
end
```
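
When a validator needs a parameter, one option is a function that returns the validator as a closure. The sketch below assumes only the `:ok` / `{:error, message}` contract shown above; `CustomValidators`, `min_length/1`, and `one_of/1` are hypothetical names and not part of ExEtlFramework.

```elixir
defmodule CustomValidators do
  # Hypothetical helpers for illustration; not part of ExEtlFramework.

  # Returns a validator that accepts strings of at least `min` characters.
  def min_length(min) when is_integer(min) and min >= 0 do
    fn
      value when is_binary(value) ->
        if String.length(value) >= min do
          :ok
        else
          {:error, "must be at least #{min} characters"}
        end

      _other ->
        {:error, "must be a string"}
    end
  end

  # Returns a validator that accepts only values contained in `allowed`.
  def one_of(allowed) when is_list(allowed) do
    fn value ->
      if value in allowed do
        :ok
      else
        {:error, "must be one of #{inspect(allowed)}"}
      end
    end
  end
end

# The returned closures can sit in a schema list next to captured functions.
validators = [CustomValidators.min_length(3), CustomValidators.one_of(["csv", "json"])]
Enum.map(validators, fn validator -> validator.("csv") end)
# => [:ok, :ok]
```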

### Error Handling Strategies

Choose between two error handling strategies:

- `:fail_fast`: Stops the pipeline at the first error
- `:collect_errors`: Continues processing and collects all errors
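
The difference between the two strategies can be sketched in plain Elixir. This is an illustration of the general technique, not ExEtlFramework's implementation: `StrategySketch`, the step format, and the return shapes are assumptions made for the example.

```elixir
defmodule StrategySketch do
  # Hypothetical illustration; not ExEtlFramework's implementation.
  # Each step is {name, fun}, where fun takes the current data and returns
  # {:ok, new_data} or {:error, reason}.

  # Stops at the first failing step and reports which step failed.
  def run(steps, data, :fail_fast) do
    Enum.reduce_while(steps, {:ok, data}, fn {name, fun}, {:ok, acc} ->
      case fun.(acc) do
        {:ok, new_data} -> {:cont, {:ok, new_data}}
        {:error, reason} -> {:halt, {:error, name, reason}}
      end
    end)
  end

  # Runs every step. A failing step leaves the data unchanged and its
  # error is recorded, so later steps still execute.
  def run(steps, data, :collect_errors) do
    {final, errors} =
      Enum.reduce(steps, {data, []}, fn {name, fun}, {acc, errors} ->
        case fun.(acc) do
          {:ok, new_data} -> {new_data, errors}
          {:error, reason} -> {acc, [{name, reason} | errors]}
        end
      end)

    case errors do
      [] -> {:ok, final}
      _ -> {:error, final, Enum.reverse(errors)}
    end
  end
end

steps = [
  extract: fn _ -> {:ok, [1, 2, 3]} end,
  transform: fn _ -> {:error, :bad_row} end,
  load: fn data -> {:ok, Enum.sum(data)} end
]

StrategySketch.run(steps, nil, :fail_fast)
# => {:error, :transform, :bad_row}

StrategySketch.run(steps, nil, :collect_errors)
# => {:error, 6, [transform: :bad_row]}
```

With `:fail_fast` the `load` step never runs; with `:collect_errors` it runs on the last good data, which is useful when you want a full error report from a single pass.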

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License.