# Gusex

**Elixir client for Polish Business Registry Service (GUS BIR)**

Gusex is an Elixir library that provides access to the Polish Central
Statistical Office (GUS) BIR (Baza Internetowa Regon) service. It enables
developers to programmatically retrieve data about business entities from Polish
registries including CEIDG, KRS, and RSPO, which are synchronised into the BIR
database.

## Features

- All officially exposed BIR service methods supported
- Search businesses by REGON, NIP, or KRS identifiers
- Retrieve full business reports and aggregate data
- Elixir structs for many data types
- Support for both test and production environments
- Uses the required SOAP 1.2 protocol with handcrafted MTOM support
- Extensive error handling and validation
- Optional session cache with TTL and transparent retry on expiry, plus a convenience lookup API on top of it

## Installation

If [available in Hex](https://hex.pm/docs/publish), the package can be installed
by adding `gusex` to your list of dependencies in `mix.exs`:

```elixir
def deps do
	[
		{:gusex, "~> 1.2"}
	]
end
```

Then run:

```bash
mix deps.get
```

## Configuration

```elixir
config :gusex,
	api_key: "your_production_api_key",
	environment: :production  # or :test
```

For testing, you can use the test environment:

```elixir
config :gusex,
	api_key: "abcde12345abcde12345",  # Test API key
	environment: :test
```

Configuring the test environment explicitly is optional: it is the default when `:environment` is not set and the application runs in the `dev` or `test` environment.
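In deployed applications the production key is usually kept out of source control. A minimal sketch of one way to do that, assuming a `config/runtime.exs` file and an environment variable named `GUSEX_API_KEY` (a hypothetical name chosen for this example, not something Gusex defines):

```elixir
# config/runtime.exs
import Config

if config_env() == :prod do
	config :gusex,
		# GUSEX_API_KEY is an assumed variable name; use whatever your deployment provides.
		api_key: System.fetch_env!("GUSEX_API_KEY"),
		environment: :production
end
```

`System.fetch_env!/1` raises at boot when the variable is missing, which tends to surface a misconfigured deployment earlier than a failed login would.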

### Optional: session cache

The convenience API (see [Convenience Lookups](#convenience-lookups)) relies on
`Gusex.SessionCache`, a GenServer-backed cache that holds authenticated sessions
and reuses them across calls. It is opt-in — add it to your application's
supervision tree:

```elixir
defmodule MyApp.Application do
	use Application

	def start(_type, _args) do
		children = [
			Gusex.SessionCache,
			# ... other children
		]

		Supervisor.start_link(children, strategy: :one_for_one)
	end
end
```

The cache reads credentials from the same `:gusex` application config and
accepts one tunable — `:session_ttl` in seconds, default `3000` (50 minutes,
a safe margin below the service's 60-minute hard session limit):

```elixir
config :gusex,
	api_key: "your_production_api_key",
	environment: :production,
	session_ttl: 3000
```

The low-level API (`zaloguj/2`, `dane_szukaj_podmioty/2`, etc.) does not
require the cache and can be used directly with manually managed sessions.

## Quick Start

### Convenience API (requires `Gusex.SessionCache` in the supervision tree)

```elixir
# One-shot lookup by NIP - session is obtained from the cache, validated,
# and transparently refreshed if the server has invalidated it.
{:ok, entities} = Gusex.find_by_nip("5261040828")
```

### Low-level API (manual session management)

```elixir
# Log in and obtain a session. The test environment is used by default;
# configure a valid API key to use the production environment.
{:ok, session} = Gusex.zaloguj()

# Search for entities by REGON (using ParametryWyszukiwania struct)
search_params = %Gusex.Types.ParametryWyszukiwania{Regon: "000331501"}
{:ok, entities} = Gusex.dane_szukaj_podmioty(search_params, session)
# or (using simple map with atom keys - error will be returned for invalid or non-atom keys)
{:ok, entities} = Gusex.dane_szukaj_podmioty(%{Regon: "000331501"}, session)

# Get detailed report for an entity
{:ok, report} = Gusex.dane_pobierz_pelny_raport("000331501", "BIR12OsPrawna", session)

# Logout
{:ok, true} = Gusex.wyloguj(session)
```
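When sessions are managed manually, it can help to make sure the logout always runs, even if the work in between raises. The sketch below is one possible wrapper, not part of Gusex: `SessionScope` and `with_session/3` are hypothetical names, and the login, logout, and work steps are passed in as functions so the helper does not depend on any particular API.

```elixir
defmodule SessionScope do
	# Hypothetical helper, not part of Gusex: logs in, runs `fun`, always logs out.
	# `login` is a 0-arity function returning {:ok, session} or {:error, reason};
	# `logout` and `fun` are 1-arity functions that receive the session.
	def with_session(login, logout, fun)
			when is_function(login, 0) and is_function(logout, 1) and is_function(fun, 1) do
		case login.() do
			{:ok, session} ->
				try do
					fun.(session)
				after
					logout.(session)
				end

			{:error, _reason} = error ->
				error
		end
	end
end

# Assumed usage with the functions shown above:
# SessionScope.with_session(&Gusex.zaloguj/0, &Gusex.wyloguj/1, fn session ->
# 	Gusex.dane_szukaj_podmioty(%{Regon: "000331501"}, session)
# end)
```

The result of `fun` is returned unchanged, and a failed login is passed through as the `{:error, reason}` tuple it produced.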

## API Reference

### Authentication

#### `zaloguj/2`

Authenticates with the BIR service and establishes a session.

```elixir
# Using configured or default (for GUS test environment) credentials
{:ok, session} = Gusex.zaloguj()

# Using explicit credentials and environment (you shouldn't do this in production code)
{:ok, session} = Gusex.zaloguj("your_api_key", :production)
```

**Returns:** `{:ok, %Gusex.Types.Session{}}` or `{:error, reason}`

### Entity Search

#### `dane_szukaj_podmioty/2`

Searches for business entities using various identifier types.

```elixir
# Search by REGON (9 or 14 digits)
search_params = %Gusex.Types.ParametryWyszukiwania{Regon: "000331501"}
{:ok, entities} = Gusex.dane_szukaj_podmioty(search_params, session)

# Search by NIP (Tax ID)
search_params = %Gusex.Types.ParametryWyszukiwania{Nip: "5260001246"}
{:ok, entities} = Gusex.dane_szukaj_podmioty(search_params, session)

# Search by KRS (Court Register number)
search_params = %Gusex.Types.ParametryWyszukiwania{Krs: "0000028860"}
{:ok, entities} = Gusex.dane_szukaj_podmioty(search_params, session)

# Multiple search (up to 20 entities)
search_params = %Gusex.Types.ParametryWyszukiwania{
	Regony9zn: "000331501,123456789",
	Nipy: "5260001246,1234567890"
}
{:ok, entities} = Gusex.dane_szukaj_podmioty(search_params, session)
```

**Returns:** `{:ok, [%Gusex.Types.Podmiot{}]}` or `{:error, error_data}`

### Detailed Reports

#### `dane_pobierz_pelny_raport/3`

Retrieves detailed information about a specific entity.

```elixir
# Legal entities (companies)
{:ok, company_data} = Gusex.dane_pobierz_pelny_raport("000331501", "BIR12OsPrawna", session)

# Natural persons - general data
{:ok, person_data} = Gusex.dane_pobierz_pelny_raport("12345678901234", "BIR12OsFizycznaDaneOgolne", session)

# Natural persons - CEIDG business activity
{:ok, ceidg_data} = Gusex.dane_pobierz_pelny_raport("12345678901234", "BIR12OsFizycznaDzialalnoscCeidg", session)

# Local units of legal entities
{:ok, local_unit} = Gusex.dane_pobierz_pelny_raport("12345678901234", "BIR12JednLokalnaOsPrawnej", session)
```

**Available Report Types:**

- `BIR12JednLokalnaOsFizycznej`
- `BIR12JednLokalnaOsFizycznejPkd`
- `BIR12JednLokalnaOsPrawnej`
- `BIR121JednLokalnaOsPrawnej` (extends `BIR12JednLokalnaOsPrawnej` with NIP and NIP status fields)
- `BIR12JednLokalnaOsPrawnejPkd`
- `BIR12OsFizycznaDaneOgolne`
- `BIR12OsFizycznaDzialalnoscCeidg`
- `BIR12OsFizycznaDzialalnoscPozostala`
- `BIR12OsFizycznaDzialalnoscRolnicza`
- `BIR12OsFizycznaDzialalnoscSkreslonaDo20141108`
- `BIR12OsFizycznaListaJednLokalnych`
- `BIR12OsFizycznaPkd`
- `BIR12OsPrawna`
- `BIR12OsPrawnaListaJednLokalnych`
- `BIR12OsPrawnaPkd`
- `BIR12OsPrawnaSpCywilnaWspolnicy`

### Aggregate Reports

#### `dane_pobierz_raport_zbiorczy/3`

Retrieves aggregate reports with lists of entities modified on a specific date.

```elixir
# New legal entities and individual business activities
{:ok, new_entities} = Gusex.dane_pobierz_raport_zbiorczy(
	"2014-01-15",
	"BIR11NowePodmiotyPrawneOrazDzialalnosciOsFizycznych",
	session
)
```

**Available Aggregate Reports:**

- `BIR11AktualizowaneJednostkiLokalne`
- `BIR11AktualizowanePodmiotyPrawneOrazDzialalnosciOsFizycznych`
- `BIR11NoweJednostkiLokalne`
- `BIR11NowePodmiotyPrawneOrazDzialalnosciOsFizycznych`
- `BIR11SkresloneJednostkiLokalne`
- `BIR11SkreslonePodmiotyPrawneOrazDzialalnosciOsFizycznych`
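The date argument in the example above is a `YYYY-MM-DD` string. If the application already works with `Date` structs, a small conversion helper keeps the format consistent. This is a sketch only: `ReportDate` is a hypothetical module, and using UTC for "yesterday" is an assumption that may need adjusting to Polish local time.

```elixir
defmodule ReportDate do
	# Hypothetical helper, not part of Gusex.
	# Turns a Date into the "YYYY-MM-DD" string used in the example above.
	def format(%Date{} = date), do: Date.to_iso8601(date)

	# The day before `today` (UTC by default), formatted the same way.
	def yesterday(today \\ Date.utc_today()) do
		today |> Date.add(-1) |> format()
	end
end
```

The resulting string can be passed as the first argument of `dane_pobierz_raport_zbiorczy/3`.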

### Service Information

#### `get_value/2`

Retrieves service diagnostic information.

```elixir
# Get service status message (no session required)
{:ok, service_message} = Gusex.get_value("KomunikatUslugi")

# Get session status (requires active session)
{:ok, session_status} = Gusex.get_value("StatusSesji", session)
```

#### `wyloguj/1`

Terminates the active session.

```elixir
{:ok, true} = Gusex.wyloguj(session)
```

### Convenience Lookups

Higher-level helpers that obtain, cache, and refresh sessions automatically.
They require `Gusex.SessionCache` in the application's supervision tree — see
[Optional: session cache](#optional-session-cache) for setup.

#### `find_by_nip/1`

Finds a business entity by NIP using a cached session. Validates and
normalises the input (accepts common separators such as spaces and
hyphens), obtains a session from the cache (logging in if needed), and
transparently retries once if the call returns a retriable failure
(`:empty_response` or an HTTP-level error).

```elixir
# Clean 10-digit NIP
{:ok, entities} = Gusex.find_by_nip("5261040828")

# Separators are tolerated
{:ok, entities} = Gusex.find_by_nip("526-10-40-828")
```

**Returns:**

- `{:ok, [%Gusex.Types.Podmiot{}]}` - List of matching entities (typically one)
- `{:error, :invalid_nip}` - Input failed NIP checksum or format validation
- `{:error, :session_cache_not_started}` - `Gusex.SessionCache` is not in the supervision tree
- `{:error, reason}` - Login, search, or other failure
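For readers unfamiliar with NIP validation, the normalisation and checksum step can be sketched as follows. This illustrates the standard NIP checksum (weights 6, 5, 7, 2, 3, 4, 5, 6, 7, modulo 11) and is not taken from the Gusex source; `NipSketch` is a hypothetical module, and the library's own validation may differ in detail.

```elixir
defmodule NipSketch do
	# Illustrative only; Gusex's own validation may differ in detail.
	@weights [6, 5, 7, 2, 3, 4, 5, 6, 7]

	# Strips spaces and hyphens, then checks the format and the checksum.
	def normalise(input) when is_binary(input) do
		nip = String.replace(input, [" ", "-"], "")

		if nip =~ ~r/\A\d{10}\z/ and checksum_ok?(nip) do
			{:ok, nip}
		else
			{:error, :invalid_nip}
		end
	end

	defp checksum_ok?(nip) do
		digits = nip |> String.graphemes() |> Enum.map(&String.to_integer/1)
		{body, [check]} = Enum.split(digits, 9)

		sum = body |> Enum.zip(@weights) |> Enum.map(fn {d, w} -> d * w end) |> Enum.sum()

		# A remainder of 10 can never equal a single digit, so such numbers are rejected.
		rem(sum, 11) == check
	end
end
```

Running a check like this before calling the service avoids spending a request on input that cannot be a valid NIP.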

#### `Gusex.SessionCache.run_with_session/1`

Runs an arbitrary one-argument function with a cached session, handling
acquisition, TTL expiry, and a single retry on retriable errors. Use this
as a building block for additional cached-session operations beyond
`find_by_nip/1`:

```elixir
Gusex.SessionCache.run_with_session(fn session ->
	Gusex.dane_pobierz_pelny_raport("000331501", "BIR12OsPrawna", session)
end)
```

**Retriable errors** (one automatic retry with a fresh login):

- `:empty_response` - The empirical signal that the server has killed the session (GUS returns HTTP 200 with an empty body in that case)
- `{:http_error, _, _}` - Possibly transient, possibly session-related

All other errors are returned unchanged.
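The retry rule described above can be pictured with a small stand-alone sketch. It is not the `Gusex.SessionCache` implementation: `RetrySketch` and its arguments are hypothetical, and session acquisition is passed in as plain functions so the control flow is visible on its own.

```elixir
defmodule RetrySketch do
	# Hypothetical stand-in for the behaviour described above; not Gusex's code.
	# `get_session` and `refresh_session` are 0-arity functions returning
	# {:ok, session} or {:error, reason}; `fun` is the 1-arity operation to run.
	def run(get_session, refresh_session, fun) do
		with {:ok, session} <- get_session.() do
			case fun.(session) do
				{:error, reason} = error ->
					if retriable?(reason), do: retry(refresh_session, fun), else: error

				other ->
					other
			end
		end
	end

	# The second attempt is final: whatever it returns is handed back to the caller.
	defp retry(refresh_session, fun) do
		with {:ok, session} <- refresh_session.(), do: fun.(session)
	end

	defp retriable?(:empty_response), do: true
	defp retriable?({:http_error, _, _}), do: true
	defp retriable?(_), do: false
end
```

Limiting the retry to a single attempt means a persistent failure still reaches the caller instead of looping.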

## Data Types and Structures

### Core Types

```elixir
# Session information
%Gusex.Types.Session{
	id: "abcdef1234567890abcd",
	environment: :test,
	endpoint_url: "https://wyszukiwarkaregontest.stat.gov.pl/wsBIR/UslugaBIRzewnPubl.svc"
}

# Search parameters
%Gusex.Types.ParametryWyszukiwania{
	Regon: "000331501",
	Nip: nil,
	Krs: nil,
	Regony9zn: nil,
	Regony14zn: nil,
	Nipy: nil,
	Krsy: nil
}

# Entity information
%Gusex.Types.Podmiot{
	Regon: "000331501",
	Nip: "5260001246",
	StatusNip: "",
	Nazwa: "GŁÓWNY URZĄD STATYSTYCZNY",
	Typ: "P",  # P = Legal entity, F = Natural person
	SilosID: "6"
}
```

### Entity Types

- **`P`** - Legal entity (Osoba prawna)
- **`F`** - Natural person conducting business activity
- **`LP`** - Local unit of legal entity
- **`LF`** - Local unit of natural person

### Activity Types (SilosID)

- **`1`** - CEIDG registered activity
- **`2`** - Agricultural activity
- **`3`** - Other business activity
- **`4`** - Activity deleted before 2014-11-08
- **`6`** - Legal entity activity
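The `Typ` and `SilosID` fields of a search result suggest which full report to request next. The mapping below is inferred from the report names and the two tables above rather than taken from the Gusex source, so treat it as an assumption to verify against the BIR documentation; `ReportPicker` is a hypothetical helper.

```elixir
defmodule ReportPicker do
	# Hypothetical helper, not part of Gusex. The mapping is inferred from the
	# report names and the Typ/SilosID tables in this README.
	def report_for("P", _silos), do: {:ok, "BIR12OsPrawna"}
	def report_for("LP", _silos), do: {:ok, "BIR12JednLokalnaOsPrawnej"}
	def report_for("LF", _silos), do: {:ok, "BIR12JednLokalnaOsFizycznej"}
	def report_for("F", "1"), do: {:ok, "BIR12OsFizycznaDzialalnoscCeidg"}
	def report_for("F", "2"), do: {:ok, "BIR12OsFizycznaDzialalnoscRolnicza"}
	def report_for("F", "3"), do: {:ok, "BIR12OsFizycznaDzialalnoscPozostala"}
	def report_for("F", "4"), do: {:ok, "BIR12OsFizycznaDzialalnoscSkreslonaDo20141108"}
	def report_for(_typ, _silos), do: {:error, :unknown_entity_kind}
end
```

With a `%Gusex.Types.Podmiot{}` in hand, the call would look like `ReportPicker.report_for(podmiot."Typ", podmiot."SilosID")`, and the returned name can be passed to `dane_pobierz_pelny_raport/3`.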

## Environment Setup

### Test Environment

- **Endpoint**: `https://wyszukiwarkaregontest.stat.gov.pl/wsBIR/UslugaBIRzewnPubl.svc`
- **WSDL**: `https://wyszukiwarkaregontest.stat.gov.pl/wsBIR/wsdl/UslugaBIRzewnPubl-ver11-test.wsdl`
- **API Key**: `abcde12345abcde12345` (public test key)
- **Data**: Partially anonymised data from November 8, 2014

### Production Environment

- **Registration**: Email `regon_bir@stat.gov.pl` with:
  - Organisation name and REGON/NIP
  - Contact person details
  - Phone numbers
  - Expected concurrent users
  - IP addresses (if applicable)
- **Endpoint**: `https://wyszukiwarkaregon.stat.gov.pl/wsBIR/UslugaBIRzewnPubl.svc`

## Error Handling

Functions return `{:ok, result}` or `{:error, reason}` tuples, so failures can be handled by pattern matching:

```elixir
case Gusex.zaloguj("api_key", :production) do
	{:ok, session} ->
		# Success
		session
	{:error, :api_key_missing} ->
		# No API key provided or configured
		handle_missing_key()
	{:error, :login_failure} ->
		# Invalid credentials or service unavailable
		handle_login_failure()
	{:error, reason} ->
		# Other errors (network, parsing, etc.)
		handle_error(reason)
end
```

## Technical Notes

### Session Management

- Sessions automatically expire after 60 minutes
- Daily maintenance at 3:20 AM closes all active sessions
- When using `Gusex.SessionCache`, both cases are handled transparently: the TTL defaults to 50 minutes (a safe margin below the server's 60-minute hard limit), and `run_with_session/1` evicts the stale entry and re-authenticates on the next call when the server has invalidated a session mid-TTL (signalled by an empty response)
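The TTL rule in the last bullet amounts to an age check on the cached entry. A minimal sketch, not the actual `Gusex.SessionCache` code; `TtlSketch`, `entry/2`, and `fresh?/3` are hypothetical names, and the 3000-second default mirrors the documented `:session_ttl`:

```elixir
defmodule TtlSketch do
	# Illustrative only; not Gusex.SessionCache's actual implementation.
	@default_ttl 3000

	# Wraps a session with the monotonic time (in seconds) at which it was obtained.
	def entry(session, now \\ System.monotonic_time(:second)) do
		%{session: session, obtained_at: now}
	end

	# A cached entry is reusable while its age is strictly below the TTL.
	def fresh?(%{obtained_at: at}, now \\ System.monotonic_time(:second), ttl \\ @default_ttl) do
		now - at < ttl
	end
end
```

Monotonic time is used here so that wall-clock adjustments cannot make an entry look younger or older than it is.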

### Data Security

- Returned data may contain executable code (HTML/JavaScript/SQL)
- Always sanitise and validate data before displaying or storing
- Escape values on output and use parameterised queries to prevent XSS and SQL injection
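As an illustration of the output side, the sketch below escapes the characters that matter in an HTML context before a registry field is rendered. `EscapeSketch` is a hypothetical module written for this example. Phoenix templates already escape interpolated values by default, and Ecto queries are parameterised, so hand-rolled code like this is mainly relevant when building markup manually.

```elixir
defmodule EscapeSketch do
	# Minimal HTML escaping for text taken from registry fields.
	# "&" comes first so the entities produced by later steps are not escaped again.
	@replacements [
		{"&", "&amp;"},
		{"<", "&lt;"},
		{">", "&gt;"},
		{"\"", "&quot;"},
		{"'", "&#39;"}
	]

	def html(text) when is_binary(text) do
		Enum.reduce(@replacements, text, fn {from, to}, acc ->
			String.replace(acc, from, to)
		end)
	end
end
```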

### Rate Limiting

- Automated sequential scanning of identifiers is not allowed
- Respect service limits and usage guidelines
- Maximum 20 entities per bulk search request
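To stay within the 20-entity limit, a longer list of identifiers can be split into batches before searching. The sketch below is a hypothetical helper (`BatchSketch` is not part of Gusex) that produces comma-separated strings in the same shape as the `Regony9zn` and `Nipy` values shown in the multiple-search example.

```elixir
defmodule BatchSketch do
	@max_per_request 20

	# Removes duplicates, then splits the identifiers into comma-separated
	# strings holding at most 20 identifiers each.
	def batches(ids) when is_list(ids) do
		ids
		|> Enum.uniq()
		|> Enum.chunk_every(@max_per_request)
		|> Enum.map(&Enum.join(&1, ","))
	end
end
```

Each resulting string would then go into one search request, for example as the `Nipy` field of `%Gusex.Types.ParametryWyszukiwania{}`.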

## Upgrading from 1.1.x

Version 1.2 tightens the success return contract of `dane_pobierz_pelny_raport/3`:
it now returns `{:ok, [struct()]}` instead of `{:ok, [map()]}`. The struct type
depends on `report_name` (e.g. `"BIR12OsPrawna"` -> `%Gusex.Types.OsPrawna{}`).
Code that matched raw maps with a non-struct guard (e.g. `when not is_struct(entry)`)
must be updated.

See `CHANGELOG` for the full list of changes, including the new
`BIR121JednLokalnaOsPrawnej` report support.
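The effect of the contract change on calling code can be sketched with a stand-in struct. `UpgradeSketch` and its nested `OsPrawna` module are hypothetical and only mimic the shape of the change; the real struct is `%Gusex.Types.OsPrawna{}`, and the single field used here is invented for the example.

```elixir
defmodule UpgradeSketch do
	defmodule OsPrawna do
		# Stand-in for %Gusex.Types.OsPrawna{}; the field is invented for the sketch.
		defstruct [:nazwa]
	end

	# 1.1.x-style clause: matches plain maps only, so it no longer fires for 1.2 results.
	def kind(entry) when is_map(entry) and not is_struct(entry), do: :plain_map

	# 1.2-style clause: match on the struct returned for the report.
	def kind(%OsPrawna{}), do: :os_prawna
end
```

Code written against 1.1.x typically needs the second kind of clause, or a guard that accepts structs, after upgrading.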

## Dependencies

- **httpoison** - HTTP client for SOAP requests
- **sweet_xml** - XML parsing and manipulation
- **exvcr** - HTTP request recording for tests (test only)
- **credo** - Static code analysis (dev/test only)
- **mock** - Mocking support for tests (test only)

## Documentation

Documentation can be generated with [ExDoc](https://github.com/elixir-lang/ex_doc)
and published on [HexDocs](https://hexdocs.pm). Once published, the docs can
be found at <https://hexdocs.pm/gusex>.

## Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes following the existing code style (do NOT run `mix format`)
4. Add tests for new functionality
5. Run tests and ensure they pass (`mix test`)
6. Run static analysis (`mix credo`)
7. Commit your changes
8. Push to the branch
9. Open a Pull Request

## License

This project is licensed under the MIT License - see the LICENCE file for details.

## Support

For issues, questions, or contributions, please visit the [Bitbucket repository](https://bitbucket.org/silverdr/gusex/).

---

**Note**: This library provides access to official Polish government business registry data. Ensure compliance with applicable data protection regulations and service terms of use.