# FastCi - Elixir

Client for the FastCI test-orchestration cloud service.

## Top-level

This Elixir application is a test orchestrator that acts as a semaphore. It starts ExUnit in a state where tests don't run automatically and establishes a connection with a remote management server. When it receives a start signal from that server, it runs the tests.

## Usage as a client

Add the package as a dependency from the private Git repository:

```elixir
def deps do
  [
    {:fast_ci, git: "git@github.com:RubyCI/fast_ci_elixir.git", branch: "master", only: :test}
  ]
end
```

> [!NOTE]
> SSH access to GitHub needs to be set up on the system.

Then set up the `config/test.exs` file:

```elixir
secret_key = System.get_env("FAST_CI_SECRET_KEY") || 
  raise """
  FAST_CI_SECRET_KEY needs to be defined!
  """

config :ex_unit,
  autorun: false

config :fast_ci,
  env: config_env(),
  server: System.get_env("FAST_CI_SERVER", "wss://apx.ruby.ci:443"),
  secret_key: secret_key

# Not required but it's nice to see what's happening.
config :logger,
  level: :debug
```

> [!IMPORTANT]
> The port needs to be specified explicitly. Even though the connection is SSL-encrypted (`wss://`), the URL still needs to include `:443`.

And then run the tests with:

```bash
mix fast_ci.test
```

If every test completes without an issue, the task exits with code `0`. If an error occurs, it exits with code `1` and writes a message to `:stderr`.

> [!TIP]
> For easy development, you can run `bin/benchmark <number_of_nodes>`, which starts multiple connections at the same time and generates a fake, random Git hash so you don't need to make fake commits during development.

## Developer notes

### Logic workflow for ExUnit + WebSocket

1) Start ExUnit with autorun disabled (`autorun: false`). Ideally this is done via a config for the test environment, so that the client doesn't have to change every `ExUnit.start/0` call in their test helpers.
2) Load the main application modules and start the main application with `Mix.Task.run("app.start")`. If this isn't run, `fast_ci` won't be able to see the main app modules when it's used as a dependency. This doesn't load the test modules, just the main app modules.
3) Load the test modules with `Code.require_file/1`. This is already implemented in the [`test.ex`](./lib/fast_ci/mix/tasks/test.ex) task file, which traverses the `test` directory and loads every `.exs` and `.ex` file. The `.ex` files need to be loaded first, though.
4) Get the `_time` variable from `ExUnit.Server.modules_loaded/0` or `ExUnit.Server.modules_loaded/1` (the arity depends on the version of ExUnit). We have no use for the result; ExUnit's data flow simply requires this call before anything else is done. Use `apply/3` for the call to suppress the warning messages.
5) Retrieve the async and sync test modules with the `ExUnit.Server.take_async_modules/1` and `ExUnit.Server.take_sync_modules/0` functions respectively. Each returns just a list of module atoms. The order of the calls matters: the async modules need to be read first.
6) Start a new WebSocket connection with the scheduling server.
7) Send the test requests over to the scheduling server and wait for a response with the string value of a module name.
8) On a server response with a test request, convert the string to the required module name via `String.to_existing_atom("Elixir." <> "module_name")` and queue it into the ExUnit server via `ExUnit.Server.add_sync_module/1`, where the argument is the module atom.
9) Run the ExUnit tests with `ExUnit.Runner.run(ExUnit.configuration(), ExUnit.Server.modules_loaded())`. The configuration argument should be made overridable. TODO: Check if the function call gives the configured variables or just the main ones.
10) Send the test result to the server and repeat until it's finished.
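
Steps 3 and 4 can be sketched roughly as follows. This is a minimal illustration, not the code in `test.ex`: the `LoaderSketch` module and both function names are hypothetical. The first function only orders paths so that `.ex` files come before `.exs` files; the second picks whichever arity a module exports and calls it through `apply/3`.

```elixir
defmodule LoaderSketch do
  @moduledoc "Hypothetical helpers illustrating steps 3 and 4; not part of fast_ci."

  # Step 3: put `.ex` files before `.exs` files.
  # `Enum.sort_by/2` is stable, so the original order is kept within each group.
  def ex_files_first(paths) do
    Enum.sort_by(paths, fn path -> Path.extname(path) != ".ex" end)
  end

  # Step 4: call the first arity of `function` that `module` actually exports.
  # `arg_lists` holds one argument list per candidate arity, in order of preference.
  # Going through `apply/3` keeps the compiler from warning about an arity
  # that does not exist in the installed version.
  def call_first_exported(module, function, arg_lists) do
    Code.ensure_loaded(module)

    case Enum.find(arg_lists, &function_exported?(module, function, length(&1))) do
      nil -> {:error, :not_exported}
      args -> {:ok, apply(module, function, args)}
    end
  end
end
```

For `ExUnit.Server.modules_loaded`, the candidate list would contain an empty argument list for the `/0` variant plus whatever argument the `/1` variant of the installed ExUnit expects; that argument is version-specific and is not shown here.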

## WebSocket communication details

The whole communication flow can be seen by running `mix fast_ci.test` with the log level set to `debug`. The following examples aren't JSON per se, but are shown in the native data style of the application. The actual data sent to the server is simply the JSON-encoded form.

### Heartbeat

Periodically send the Phoenix heartbeat:

```elixir
%{ref: 0, payload: %{}, event: "heartbeat", topic: "phoenix"}
```

### Joining topics

To join the Phoenix topic, the following message needs to be sent. The topic is shared across the nodes and is unique to the run:

```elixir
%{ref: 1, payload: %{}, event: "phx_join", topic: "test_orchestrator:ex_unit-0891fba"}
```

The server then replies with the following message:

```elixir
%{
  "event" => "phx_reply",
  "payload" => %{
    "response" => %{
      "event" => "join",
      "node_index" => 1,
      "state" => "running"
    },
    "status" => "ok"
  },
  "ref" => 1,
  "topic" => "test_orchestrator:ex_unit-0891fba"
}
```

### Enqueuing requests

If the node index received when joining the topic is 0, that node should send the `enq` request:

```elixir
%{
  ref: 2,
  payload: %{
    tests: %{
      "Elixir.FastCi.Subdir1.Test#sync" => %{
        :file_status => "pending",
        :test_count => 1,
        :test_counters => %FastCi.Structs.Test.Counters{
          failed: 0,
          passed: 0,
          pending: 0
        },
        "1" => %{
          "status" => "pending",
          :run_time => 0.0
        }
      },
      "Elixir.FastCi.Subdir2.Test#async" => %{
        :file_status => "pending",
        :test_count => 1,
        :test_counters => %FastCi.Structs.Test.Counters{
          failed: 0,
          passed: 0,
          pending: 0
        },
        "1" => %{
          "status" => "pending",
          :run_time => 0.0
        }
      },
      ...
    }
  },
  event: "enq",
  topic: "test_orchestrator:ex_unit-0891fba"
}
```

The `"1" =>` entry is required as dummy data; without it the server raises an error. If the request is accepted, the server sends no response.
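
Building that message from a list of test identifiers might look like the sketch below. `EnqPayloadSketch` and its functions are hypothetical, and a plain map stands in for `%FastCi.Structs.Test.Counters{}` so that the example runs without the package; the real struct should be used in its place.

```elixir
defmodule EnqPayloadSketch do
  # One "pending" entry per test module, including the dummy "1" test
  # that the server expects to be present.
  def pending_entry do
    %{
      :file_status => "pending",
      :test_count => 1,
      # Assumption: plain map used as a stand-in for %FastCi.Structs.Test.Counters{}.
      :test_counters => %{failed: 0, passed: 0, pending: 0},
      "1" => %{"status" => "pending", :run_time => 0.0}
    }
  end

  # Maps every test identifier (e.g. "Elixir.Foo.Test#sync") to a pending entry.
  def tests(test_ids), do: Map.new(test_ids, fn id -> {id, pending_entry()} end)

  # Wraps the tests in the `enq` message envelope.
  def message(ref, topic, test_ids) do
    %{ref: ref, payload: %{tests: tests(test_ids)}, event: "enq", topic: topic}
  end
end
```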

### Dequeue/run tests

To be able to run tests, a node has to send a dequeue event to the server, essentially saying: "Hey, I can do stuff now!"

```elixir
%{ref: 3, payload: %{}, event: "deq", topic: "test_orchestrator:ex_unit-0891fba"}
```

The server should then respond with:

```elixir
%{
  "event" => "phx_reply",
  "payload" => %{
    "response" => %{
      "event" => "deq",
      "tests" => [
        "Elixir.FastCi.Subdir2.Test#async",
        "Elixir.FastCi.Subdir5.Test#sync",
        "Elixir.FastCiTest#async"
      ]
    },
    "status" => "ok"
  },
  "ref" => 3,
  "topic" => "test_orchestrator:ex_unit-0891fba"
}
```

If the `tests` list is empty, the node has done everything it should and can be closed.
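
Handling the `deq` reply could be sketched like this. The module and function names are hypothetical. The sketch assumes the identifiers arrive in the `"Elixir.Module.Name#mode"` form shown above, so the `Elixir.` prefix is already present and only the `#sync` / `#async` suffix has to be split off before calling `String.to_existing_atom/1`.

```elixir
defmodule DeqReplySketch do
  # An empty test list means there is nothing left for this node.
  def next_step(%{
        "payload" => %{"status" => "ok", "response" => %{"event" => "deq", "tests" => []}}
      }),
      do: :done

  # Otherwise turn every identifier into a {module, mode} pair.
  def next_step(%{
        "payload" => %{"status" => "ok", "response" => %{"event" => "deq", "tests" => tests}}
      })
      when is_list(tests) do
    {:run, Enum.map(tests, &parse_test_id/1)}
  end

  # "Elixir.Foo.Test#async" -> {Foo.Test, :async}
  # `String.to_existing_atom/1` raises if the module atom does not exist yet,
  # which is why the test modules must be loaded before this point.
  def parse_test_id(id) do
    [module_name, mode] = String.split(id, "#", parts: 2)
    {String.to_existing_atom(module_name), parse_mode(mode)}
  end

  defp parse_mode("sync"), do: :sync
  defp parse_mode("async"), do: :async
end
```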

Once the tests are completed, the following message, which contains the results of the tests that were run, needs to be sent:

```elixir
%{
  ref: 4,
  payload: %{
    "Elixir.FastCi.Subdir2.Test#async" => %{
      :file_status => "passed",
      :test_count => 1,
      :test_counters => %FastCi.Structs.Test.Counters{
        failed: 0,
        passed: 1,
        pending: 0
      },
      "1" => %{
        "status" => "passed",
        :run_time => 6.555
      }
    },
    "Elixir.FastCi.Subdir5.Test#sync" => %{
      :file_status => "passed",
      :test_count => 1,
      :test_counters => %FastCi.Structs.Test.Counters{
        failed: 0,
        passed: 1,
        pending: 0
      },
      "1" => %{
        "status" => "passed",
        :run_time => 6.512
      }
    },
    "Elixir.FastCiTest#async" => %{
      :file_status => "passed",
      :test_count => 1,
      :test_counters => %FastCi.Structs.Test.Counters{
        failed: 0,
        passed: 1,
        pending: 0
      },
      "1" => %{
        "status" => "passed",
        :run_time => 6.511
      }
    }
  },
  event: "deq",
  topic: "test_orchestrator:ex_unit-0891fba"
}
```

### Leaving the topic

To indicate that the node is closing, send this message:

```elixir
%{ref: 5, payload: %{}, event: "leave", topic: "test_orchestrator:ex_unit-0891fba"}
```
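
All outgoing messages in this section share the same envelope (`ref`, `payload`, `event`, `topic`), and the `ref` values in the examples increase by one per message. A minimal sketch of building such envelopes follows; `EnvelopeSketch` is a hypothetical helper, and numbering the refs consecutively is an assumption read off the examples, not a documented server requirement.

```elixir
defmodule EnvelopeSketch do
  # Builds one outgoing message; the payload defaults to an empty map.
  def build(ref, topic, event, payload \\ %{}) do
    %{ref: ref, payload: payload, event: event, topic: topic}
  end

  # Builds a list of payload-less messages from {topic, event} pairs,
  # numbering their refs consecutively starting at `first_ref`.
  def sequence(first_ref, specs) do
    specs
    |> Enum.with_index(first_ref)
    |> Enum.map(fn {{topic, event}, ref} -> build(ref, topic, event) end)
  end
end
```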