<p align="center">
<img src="assets/weaviate_ex.svg" alt="WeaviateEx Logo" width="200" height="200">
</p>
# WeaviateEx
A modern, idiomatic Elixir client for the [Weaviate](https://weaviate.io) vector database (v1.28+) with **full Python client feature parity**.
## Features
### Core Capabilities
- **Complete API Coverage** - Collections, objects, batch operations, queries, aggregations, cross-references, tenants
- **RBAC & User Management** - Full role-based access control, user lifecycle management, OIDC groups
- **Hybrid Protocol Architecture** - gRPC for high-performance data operations, HTTP for schema management
- **Type-Safe** - Protocol-based architecture with comprehensive typespecs
- **Test-First Design** - 2600+ tests with Mox-based mocking for fast, isolated testing
- **Production-Ready** - gRPC persistent channels, Finch HTTP pooling, proper error handling, health checks
- **Easy Setup** - First-class Mix tasks for managing local Weaviate stacks
### Generative AI (RAG) - 20+ Providers
- **OpenAI** (GPT-4, GPT-3.5, O1/O3 reasoning models)
- **Anthropic** (Claude 3.5 Sonnet, Claude 3 Opus/Haiku)
- **Cohere**, **Google Vertex/Gemini**, **AWS Bedrock/SageMaker**
- **Mistral**, **Ollama**, **XAI (Grok)**, **ContextualAI**
- **NEW in v0.3**: NVIDIA NIM, Databricks, FriendliAI
- Typed provider configurations with full parameter support
- Multimodal generation with image support
### Vector Search
- **Semantic Search** - near_text, near_vector, near_object
- **Multimodal Search** - near_image (images), near_media (audio, video, thermal, depth, IMU)
- **Hybrid Search** - Combined keyword + vector with configurable alpha
- **BM25 Keyword Search** - Full-text search with AND/OR operators
- **Reranking** - gRPC-based result reranking with Cohere, Transformers, VoyageAI, and more
- **Multi-Vector Support** - ColBERT-style embeddings with Muvera encoding
- **Named Vectors** - Multiple vectors per object with targeting strategies
### Advanced Features
- **Cross-References** - Full CRUD for object relationships
- **Multi-Tenancy** - HOT, COLD, FROZEN, OFFLOADED states
- **Batch Operations** - Error tracking, retry logic, rate limit handling
- **Embedded Mode** - Run Weaviate without Docker
- **20+ Vectorizers** - OpenAI, Cohere, VoyageAI, Jina, Transformers, Ollama, and more
- **gRPC Batch Streaming** - High-performance bidirectional streaming (Weaviate 1.34+)
## Table of Contents
- [Quick Start](#quick-start)
- [Installation](#installation)
- [Configuration](#configuration)
- [Usage](#usage)
- [Embedded Mode](#embedded-mode)
- [Health Checks](#health-checks)
- [Server Version Detection](#server-version-detection)
- [Collections (Schema Management)](#collections-schema-management)
- [Data Operations (CRUD)](#data-operations-crud)
- [Objects API](#objects-api)
- [Batch Operations](#batch-operations)
- [Queries & Vector Search](#queries--vector-search)
- [Multimodal Search](#multimodal-search)
- [Aggregations](#aggregations)
- [Advanced Filtering](#advanced-filtering)
- [Vector Configuration](#vector-configuration)
- [Backup & Restore](#backup--restore)
- [Multi-Tenancy](#multi-tenancy)
- [RBAC (Role-Based Access Control)](#rbac-role-based-access-control)
- [User Management](#user-management)
- [Group Management](#group-management)
- [Examples](#examples)
- [Testing](#testing)
- [Mix Tasks](#mix-tasks)
- [Docker Management](#docker-management)
- [Authentication](#authentication)
- [Connection Management](#connection-management)
- [Debug & Troubleshooting](#debug--troubleshooting)
- [Documentation](#documentation)
- [Contributing](#contributing)
- [License](#license)
## Quick Start
### 1. Start Weaviate locally
> 🧰 **Prerequisite**: Docker Desktop (macOS/Windows) or Docker Engine (Linux)
We ship Docker Compose profiles from the Python client under `ci/`. Use our Mix tasks to bring everything up:
```bash
# Start Weaviate containers (default version: 1.35.0)
mix weaviate.start
# Or specify a version
mix weaviate.start --version 1.35.0
# Inspect running services and health status
mix weaviate.status
```
The first run downloads the Weaviate Docker image and waits for the `/v1/.well-known/ready` endpoint to return `200`.
When you're done:
```bash
mix weaviate.stop
```
> Prefer direct scripts? Use `./ci/start_weaviate.sh 1.35.0` and `./ci/stop_weaviate.sh`.
### 2. Add to Your Project
Add `weaviate_ex` to your `mix.exs` dependencies:
```elixir
def deps do
[
{:weaviate_ex, "~> 0.7.4"}
]
end
```
Then fetch dependencies:
```bash
mix deps.get
```
### 3. Configure
The library automatically reads from environment variables (loaded from `.env`):
```bash
# .env file (created by install.sh)
WEAVIATE_URL=http://localhost:8080
WEAVIATE_API_KEY= # Optional, for authenticated instances
```
Or configure in your Elixir config files:
```elixir
# config/config.exs
config :weaviate_ex,
url: "http://localhost:8080",
api_key: nil, # Optional
strict: true # Default: true - fails fast if Weaviate is unreachable
```
**Strict Mode**: By default, WeaviateEx validates connectivity on startup. If Weaviate is unreachable, your application won't start. Set `strict: false` to allow startup anyway (useful for development when Weaviate might not always be running).
### 4. Verify Connection
The library automatically performs a health check on startup:
```
[WeaviateEx] Successfully connected to Weaviate
URL: http://localhost:8080
Version: 1.34.0-rc.0
```
You can also run `mix weaviate.status` to see every profile that’s currently online and the ports they expose.
If configuration is missing, you'll get helpful error messages:
```
╔══════════════════════════════════════════════════════════════════════╗
║ WeaviateEx Configuration Error ║
╠══════════════════════════════════════════════════════════════════════╣
║ Missing required configuration: WEAVIATE_URL ║
║ ║
║ Please set the Weaviate URL using one of these methods: ║
║ 1. Environment variable: export WEAVIATE_URL=http://localhost:8080 ║
║ 2. Application configuration (config/config.exs) ║
║ 3. Runtime configuration (config/runtime.exs) ║
╚══════════════════════════════════════════════════════════════════════╝
```
### 5. Shape a Tenant-Aware Collection and Load Data
```elixir
alias WeaviateEx.{Collections, Objects, Batch}
# Define the collection and toggle multi-tenancy when ready
{:ok, _collection} =
Collections.create("Article", %{
description: "Articles by tenant",
properties: [
%{name: "title", dataType: ["text"]},
%{name: "content", dataType: ["text"]}
]
})
{:ok, %{"enabled" => true}} = Collections.set_multi_tenancy("Article", true)
{:ok, true} = Collections.exists?("Article")
# Create & read tenant-scoped objects with _additional metadata
{:ok, created} =
Objects.create("Article", %{properties: %{title: "Tenant scoped", content: "Hello!"}},
tenant: "tenant-a"
)
{:ok, fetched} =
Objects.get("Article", created["id"],
tenant: "tenant-a",
include: ["_additional", "vector"]
)
# Batch ingest with a summary that separates successes from errors
objects =
Enum.map(1..3, fn idx ->
%{class: "Article", properties: %{title: "Story #{idx}"}, tenant: "tenant-a"}
end)
{:ok, summary} = Batch.create_objects(objects, return_summary: true, tenant: "tenant-a")
summary.statistics
#=> %{processed: 3, successful: 3, failed: 0}
```
## Installation
See [INSTALL.md](INSTALL.md) for detailed installation instructions covering:
- Docker installation on various platforms
- Manual Weaviate setup
- Configuration options
- Troubleshooting
## Configuration
### Environment Variables
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `WEAVIATE_URL` | Yes | - | Full URL to Weaviate (e.g., `http://localhost:8080`) |
| `WEAVIATE_API_KEY` | No | - | API key for authentication (for cloud/production) |
### Application Configuration
```elixir
# config/config.exs
config :weaviate_ex,
url: System.get_env("WEAVIATE_URL", "http://localhost:8080"),
api_key: System.get_env("WEAVIATE_API_KEY"),
strict: true, # Fail on startup if unreachable
timeout: 30_000 # Request timeout in milliseconds
```
### gRPC Configuration
WeaviateEx v0.4.0+ uses a hybrid protocol architecture: gRPC for data operations (queries, batch, aggregations) and HTTP for schema management. gRPC provides significantly better performance for high-throughput operations.
```elixir
# config/config.exs
config :weaviate_ex,
url: "http://localhost:8080", # HTTP endpoint for schema operations
grpc_host: "localhost", # gRPC host (default: derived from url)
grpc_port: 50051, # gRPC port (default: 50051)
grpc_max_message_size: 104_857_600, # Max message size in bytes (default: 100MB)
api_key: nil # Used for both HTTP and gRPC auth
```
| Option | Required | Default | Description |
|----------|----------|---------|-------------|
| `grpc_host` | No | Derived from `url` | gRPC endpoint hostname |
| `grpc_port` | No | `50051` | gRPC port |
| `grpc_max_message_size` | No | `104857600` | Maximum gRPC message size (100MB) |
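The "derived from `url`" default can be pictured with the standard library's `URI` module. The following is a minimal sketch of one plausible derivation, not WeaviateEx's actual code; the `GrpcDefaults` module and its `endpoint/2` function are hypothetical names used only for illustration.

```elixir
defmodule GrpcDefaults do
  # Hypothetical helper, not part of WeaviateEx.
  @default_port 50_051

  # Returns {host, port}. The host comes from the HTTP URL unless
  # :grpc_host is given; the port falls back to 50051.
  def endpoint(url, opts \\ []) do
    host = Keyword.get(opts, :grpc_host) || URI.parse(url).host
    port = Keyword.get(opts, :grpc_port, @default_port)
    {host, port}
  end
end

GrpcDefaults.endpoint("http://localhost:8080")
GrpcDefaults.endpoint("https://weaviate.example.com", grpc_port: 443)
```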
The gRPC connection is automatically established when you create a client:
```elixir
# Connect with gRPC (automatic)
{:ok, client} = WeaviateEx.Client.connect(
url: "http://localhost:8080",
grpc_port: 50051
)
# Client now has both HTTP and gRPC channels
client.grpc_channel # => gRPC channel for data operations
client.config # => Configuration for HTTP operations
```
### Custom Headers (v0.7.1+)
Add custom headers to all HTTP and gRPC requests for authentication, tracing, or other purposes:
```elixir
# Configure additional headers in client config
{:ok, client} = WeaviateEx.Client.connect(
url: "http://localhost:8080",
additional_headers: %{
"X-Custom-Header" => "custom-value",
"X-Request-ID" => "trace-123",
"Authorization" => "Bearer custom-token"
}
)
# Headers are automatically included in:
# - All HTTP requests (schema operations, health checks)
# - All gRPC requests as metadata (lowercased keys)
```
Headers are validated on client creation - nil values will raise an `ArgumentError`.
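To make the two rules above concrete (nil values are rejected, gRPC metadata keys are lowercased), here is a small self-contained sketch. `HeaderSketch` is a hypothetical module written for this README; the library's internal implementation may differ.

```elixir
defmodule HeaderSketch do
  # Hypothetical illustration, not WeaviateEx's internal code.

  # Raises ArgumentError when any header value is nil;
  # returns the headers unchanged otherwise.
  def validate!(headers) when is_map(headers) do
    Enum.each(headers, fn
      {key, nil} -> raise ArgumentError, "header #{inspect(key)} has a nil value"
      {_key, _value} -> :ok
    end)

    headers
  end

  # Downcases every key, as required for gRPC metadata.
  def to_grpc_metadata(headers) do
    Map.new(headers, fn {key, value} -> {String.downcase(key), value} end)
  end
end

%{"X-Request-ID" => "trace-123"}
|> HeaderSketch.validate!()
|> HeaderSketch.to_grpc_metadata()
```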
### gRPC Retry with Exponential Backoff (v0.7.1+)
All gRPC operations automatically retry on transient errors with exponential backoff:
```elixir
# Retryable gRPC status codes:
# - UNAVAILABLE (14) - Service temporarily unavailable
# - RESOURCE_EXHAUSTED (8) - Rate limiting
# - ABORTED (10) - Transaction aborted
# - DEADLINE_EXCEEDED (4) - Timeout
# Default: 4 retries with exponential backoff
# Attempt 0: 1 second delay
# Attempt 1: 2 seconds
# Attempt 2: 4 seconds
# Attempt 3: 8 seconds
# Maximum delay capped at 32 seconds
# Configure retry behavior (optional)
alias WeaviateEx.GRPC.Retry
# Custom retry with options
result = Retry.with_retry(
fn -> some_grpc_operation() end,
max_retries: 3,
base_delay_ms: 500
)
# Check if error is retryable
Retry.retryable?(%GRPC.RPCError{status: 14}) # => true (UNAVAILABLE)
Retry.retryable?(%GRPC.RPCError{status: 3}) # => false (INVALID_ARGUMENT)
# Calculate backoff delay
Retry.calculate_backoff(0) # => 1000ms
Retry.calculate_backoff(2) # => 4000ms
Retry.calculate_backoff(5) # => 32000ms (capped)
```
All gRPC services (Search, Batch, Aggregate, Tenants, Health) automatically use retry logic.
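The delay schedule listed above follows the usual "double and cap" rule. The sketch below reproduces those numbers in plain Elixir; `BackoffSketch` is a hypothetical module for illustration and is not the library's `Retry` implementation.

```elixir
defmodule BackoffSketch do
  # Hypothetical illustration of capped exponential backoff.
  @base_delay_ms 1_000
  @max_delay_ms 32_000

  # The delay doubles with each 0-based attempt and is capped at `max`.
  def delay_ms(attempt, base \\ @base_delay_ms, max \\ @max_delay_ms)
      when is_integer(attempt) and attempt >= 0 do
    min(base * Integer.pow(2, attempt), max)
  end
end

Enum.map(0..5, &BackoffSketch.delay_ms/1)
```

With a 500 ms base, as in the `base_delay_ms: 500` option shown earlier, the same rule yields 500, 1000, 2000 ms and so on.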
### Proxy Configuration (v0.5.0+)
WeaviateEx supports HTTP, HTTPS, and gRPC proxy configuration:
```elixir
alias WeaviateEx.Config.Proxy
# Read from environment variables (HTTP_PROXY, HTTPS_PROXY, GRPC_PROXY)
proxy = Proxy.from_env()
# Or configure explicitly
proxy = Proxy.new(
http: "http://proxy.example.com:8080",
https: "https://proxy.example.com:8443",
grpc: "http://grpc-proxy.example.com:8080"
)
# Check if proxy is configured
Proxy.configured?(proxy) # => true
# Get Finch HTTP client options
Proxy.to_finch_opts(proxy) # => [proxy: {:https, "proxy.example.com", 8443, []}]
# Get gRPC channel options
Proxy.to_grpc_opts(proxy) # => [http_proxy: "http://grpc-proxy.example.com:8080"]
```
Both the upper-case and lower-case form of each variable are checked; the upper-case form takes precedence:
- `HTTP_PROXY` / `http_proxy` - HTTP proxy URL
- `HTTPS_PROXY` / `https_proxy` - HTTPS proxy URL
- `GRPC_PROXY` / `grpc_proxy` - gRPC proxy URL
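The precedence rule can be expressed in a few lines with `System.get_env/1`. This is a sketch under stated assumptions: `ProxyEnvSketch` is a hypothetical module, and treating an empty string as "unset" is an assumption of this example rather than documented behavior.

```elixir
defmodule ProxyEnvSketch do
  # Hypothetical illustration, not WeaviateEx's Proxy module.

  # Checks the upper-case variable first, then the lower-case one.
  # Assumption: empty strings are treated as unset.
  def fetch(name) do
    [String.upcase(name), String.downcase(name)]
    |> Enum.map(&System.get_env/1)
    |> Enum.find(fn value -> value not in [nil, ""] end)
  end
end

ProxyEnvSketch.fetch("http_proxy")
```

On operating systems where environment variable names are case-insensitive, both lookups resolve to the same variable.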
### Runtime Configuration (Recommended for Production)
```elixir
# config/runtime.exs
config :weaviate_ex,
url: System.fetch_env!("WEAVIATE_URL"),
api_key: System.get_env("WEAVIATE_API_KEY")
```
## Usage
### Embedded Mode
Need an ephemeral instance without Docker? WeaviateEx can download and manage the official embedded binary:
```elixir
# Downloads (once) into ~/.cache/weaviate-embedded and starts the process
{:ok, embedded} =
WeaviateEx.start_embedded(
version: "1.34.0",
port: 8099,
grpc_port: 50155,
persistence_data_path: Path.expand("tmp/weaviate-data"),
environment_variables: %{"DISABLE_TELEMETRY" => "true"}
)
# Talk to it just like any other instance
System.put_env("WEAVIATE_URL", "http://localhost:8099")
{:ok, meta} = WeaviateEx.health_check()
# Always stop the handle when finished
:ok = WeaviateEx.stop_embedded(embedded)
```
Passing `version: "latest"` fetches the most recent GitHub release. Binaries are cached, so subsequent calls reuse the download. You can override `binary_path`/`persistence_data_path` to control where the executable and data live.
### Health Checks
Check if Weaviate is accessible and get version information:
```elixir
# Get metadata (version, modules)
{:ok, meta} = WeaviateEx.health_check()
# => %{"version" => "1.34.0-rc.0", "modules" => %{}}
# Check readiness (can handle requests) - K8s readiness probe
{:ok, true} = WeaviateEx.ready?()
# Check liveness (service is up) - K8s liveness probe
{:ok, true} = WeaviateEx.alive?()
# With explicit client
{:ok, client} = WeaviateEx.Client.connect(base_url: "http://localhost:8080")
{:ok, true} = WeaviateEx.Health.alive?(client)
{:ok, true} = WeaviateEx.Health.ready?(client)
# Wait for Weaviate to become ready (useful for startup scripts)
:ok = WeaviateEx.Health.wait_until_ready(timeout: 30_000, check_interval: 1000)
# gRPC health ping (v0.7.0+)
alias WeaviateEx.GRPC.Services.Health, as: GRPCHealth
:ok = GRPCHealth.ping(client.grpc_channel)
```
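Waiting for readiness amounts to polling a check until it succeeds or a deadline passes. The sketch below shows that loop with an injected check function so it runs without a server; `ReadySketch` and its argument names are hypothetical and only mirror the `timeout` and `check_interval` options above.

```elixir
defmodule ReadySketch do
  # Hypothetical illustration of a readiness polling loop.

  # Calls `check` every `interval` ms until it returns true or
  # `timeout` ms have elapsed. `check` is a zero-arity function.
  def wait_until(check, timeout \\ 30_000, interval \\ 1_000) do
    deadline = System.monotonic_time(:millisecond) + timeout
    poll(check, deadline, interval)
  end

  defp poll(check, deadline, interval) do
    cond do
      check.() ->
        :ok

      System.monotonic_time(:millisecond) >= deadline ->
        {:error, :timeout}

      true ->
        Process.sleep(interval)
        poll(check, deadline, interval)
    end
  end
end

# A check that never succeeds times out after roughly 50 ms.
ReadySketch.wait_until(fn -> false end, 50, 10)
```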
#### Kubernetes Integration
The `alive?` and `ready?` functions call Weaviate's liveness and readiness endpoints, which can also be used directly as Kubernetes probes:
- **Liveness**: `/v1/.well-known/live` - Is the process running?
- **Readiness**: `/v1/.well-known/ready` - Can the service handle traffic?
```yaml
# Example K8s deployment liveness/readiness probes
livenessProbe:
  httpGet:
    path: /v1/.well-known/live
    port: 8080
readinessProbe:
  httpGet:
    path: /v1/.well-known/ready
    port: 8080
```
### Server Version Detection
Parse and validate Weaviate server versions (v0.7.0+):
```elixir
alias WeaviateEx.Version
# Parse version strings
{:ok, {1, 28, 0}} = Version.parse("1.28.0")
{:ok, {1, 28, 0}} = Version.parse("v1.28.0-rc1") # Handles v prefix and prerelease
# Check if version meets minimum requirement
true = Version.meets_minimum?({1, 28, 0}, {1, 27, 0})
false = Version.meets_minimum?({1, 26, 0}, {1, 27, 0})
# Validate server version (minimum: 1.27.0)
:ok = Version.validate_server({1, 28, 0})
{:error, {:unsupported_version, {1, 20, 0}, {1, 27, 0}}} = Version.validate_server({1, 20, 0})
# Extract version from meta endpoint response
{:ok, meta} = WeaviateEx.health_check()
{:ok, {1, 28, 0}} = Version.get_server_version(meta)
# Get minimum supported version
Version.minimum_version() # => {1, 27, 0}
Version.minimum_version_string() # => "1.27.0"
# Format version tuple to string
"1.28.0" = Version.format_version({1, 28, 0})
```
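For readers curious how such parsing and comparison can work, here is a self-contained sketch. `VersionSketch` is a hypothetical module, not the library's `Version` implementation; it relies on the fact that Erlang term ordering compares same-size tuples element by element.

```elixir
defmodule VersionSketch do
  # Hypothetical illustration, not WeaviateEx.Version.

  # Accepts "1.28.0", "v1.28.0" and pre-release forms like "1.28.0-rc1".
  def parse(string) when is_binary(string) do
    case Regex.run(~r/^v?(\d+)\.(\d+)\.(\d+)/, string) do
      [_, major, minor, patch] ->
        {:ok, {String.to_integer(major), String.to_integer(minor), String.to_integer(patch)}}

      nil ->
        {:error, :invalid_version}
    end
  end

  # Same-size integer tuples compare element by element,
  # so {1, 28, 0} >= {1, 27, 0} behaves like a version comparison.
  def meets_minimum?(version, minimum), do: version >= minimum
end

{:ok, version} = VersionSketch.parse("v1.28.0-rc1")
VersionSketch.meets_minimum?(version, {1, 27, 0})
```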
### Collections (Schema Management)
Collections define the structure of your data:
```elixir
# Create a collection with properties
{:ok, collection} = WeaviateEx.Collections.create("Article", %{
description: "News articles",
properties: [
%{name: "title", dataType: ["text"]},
%{name: "content", dataType: ["text"]},
%{name: "publishedAt", dataType: ["date"]},
%{name: "views", dataType: ["int"]}
],
vectorizer: "none" # Use "text2vec-openai" for auto-vectorization
})
# List all collections
{:ok, schema} = WeaviateEx.Collections.list()
# Get a specific collection
{:ok, collection} = WeaviateEx.Collections.get("Article")
# Add a property to existing collection
{:ok, property} = WeaviateEx.Collections.add_property("Article", %{
name: "author",
dataType: ["text"]
})
# Check if collection exists
{:ok, true} = WeaviateEx.Collections.exists?("Article")
# Delete a collection
{:ok, _} = WeaviateEx.Collections.delete("Article")
```
### Object TTL (Time-To-Live)
Automatically expire and delete objects after a specified duration:
```elixir
alias WeaviateEx.Config.ObjectTTL
# Create collection with 24-hour TTL using human-readable duration
{:ok, _} = WeaviateEx.Collections.create("Events", %{
properties: [%{name: "title", dataType: ["text"]}],
object_ttl: ObjectTTL.from_duration(hours: 24)
})
# Or specify exact seconds with creation time deletion
{:ok, _} = WeaviateEx.Collections.create("Sessions", %{
properties: [%{name: "user_id", dataType: ["text"]}],
object_ttl: ObjectTTL.delete_by_creation_time(3600) # 1 hour
})
# Delete objects based on last update time
{:ok, _} = WeaviateEx.Collections.create("Cache", %{
properties: [%{name: "data", dataType: ["text"]}],
object_ttl: ObjectTTL.delete_by_update_time(86_400, true) # 24h, filter expired
})
# Delete objects based on a custom date property
{:ok, _} = WeaviateEx.Collections.create("Subscriptions", %{
properties: [
%{name: "plan", dataType: ["text"]},
%{name: "expires_at", dataType: ["date"]}
],
object_ttl: ObjectTTL.delete_by_date_property("expires_at")
})
# Update TTL on existing collection
{:ok, _} = WeaviateEx.Collections.update_ttl("Events",
ObjectTTL.from_duration(days: 7)
)
# Disable TTL
{:ok, _} = WeaviateEx.Collections.update_ttl("Events",
ObjectTTL.disable()
)
```
**Note:** Objects are deleted asynchronously in the background. The `filter_expired_objects`
option (second parameter in `delete_by_*` functions) controls whether expired but not yet
deleted objects are excluded from search results.
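A human-readable duration such as `hours: 24` corresponds to a number of seconds such as `86_400`. The sketch below shows that arithmetic; `DurationSketch` is a hypothetical helper, and the set of supported units is an assumption of this example, not a statement about `ObjectTTL.from_duration/1`.

```elixir
defmodule DurationSketch do
  # Hypothetical illustration; the unit names are assumptions.
  @seconds_per %{seconds: 1, minutes: 60, hours: 3_600, days: 86_400}

  # Sums a keyword list such as [hours: 1, minutes: 30] into seconds.
  # Raises KeyError for an unknown unit.
  def to_seconds(parts) when is_list(parts) do
    Enum.reduce(parts, 0, fn {unit, amount}, total ->
      total + amount * Map.fetch!(@seconds_per, unit)
    end)
  end
end

DurationSketch.to_seconds(hours: 24)
DurationSketch.to_seconds(days: 7)
```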
Schema helpers for range filters and auto-tenant configuration:
```elixir
alias WeaviateEx.Config.{AutoTenant, ObjectTTL}
alias WeaviateEx.Schema.MultiTenancyConfig
alias WeaviateEx.Property
ttl = ObjectTTL.delete_by_update_time(86_400, true)
{:ok, _} = WeaviateEx.Collections.create("Session", %{
properties: [
Property.number("expires_in", index_range_filters: true)
],
object_ttl: ttl,
multi_tenancy_config: MultiTenancyConfig.new(enabled: true, auto_tenant_creation: true),
auto_tenant: AutoTenant.enable(auto_delete_timeout: 3_600)
})
```
### Nested Properties
Define complex object structures with nested properties:
```elixir
alias WeaviateEx.Property
alias WeaviateEx.Property.Nested
# Create a collection with nested object properties
{:ok, _} = WeaviateEx.Collections.create("Product", %{
description: "Products with specifications",
properties: [
%{name: "name", dataType: ["text"]},
%{name: "price", dataType: ["number"]},
# Nested object property
Property.object("specs", [
Nested.new(name: "weight", data_type: :number),
Nested.new(name: "dimensions", data_type: :text),
Nested.new(name: "material", data_type: :text)
]),
# Array of nested objects
Property.object_array("variants", [
Nested.new(name: "color", data_type: :text),
Nested.new(name: "size", data_type: :text),
Nested.new(name: "sku", data_type: :text),
Nested.new(name: "stock", data_type: :int)
])
]
})
# Insert object with nested data
{:ok, product} = WeaviateEx.Objects.create("Product", %{
properties: %{
name: "Laptop Stand",
price: 79.99,
specs: %{
weight: 2.5,
dimensions: "30x25x15cm",
material: "aluminum"
},
variants: [
%{color: "silver", size: "standard", sku: "LS-001", stock: 50},
%{color: "black", size: "large", sku: "LS-002", stock: 30}
]
}
})
# Deeply nested properties (object within object)
{:ok, _} = WeaviateEx.Collections.create("Company", %{
properties: [
%{name: "name", dataType: ["text"]},
Property.object("headquarters", [
Nested.new(name: "city", data_type: :text),
Nested.new(name: "country", data_type: :text),
Nested.new(
name: "address",
data_type: :object,
nested_properties: [
Nested.new(name: "street", data_type: :text),
Nested.new(name: "zip", data_type: :text)
]
)
])
]
})
# Parse nested properties from API response
api_data = %{
"name" => "specs",
"dataType" => ["object"],
"nestedProperties" => [
%{"name" => "weight", "dataType" => ["number"]}
]
}
nested = Nested.from_api(api_data)
```
### Data Operations (CRUD)
Simple CRUD operations with automatic UUID generation:
```elixir
alias WeaviateEx.API.Data
# Create (insert) a new object
data = %{
properties: %{
"title" => "Hello Weaviate",
"content" => "This is a test article",
"views" => 0
},
vector: [0.1, 0.2, 0.3, 0.4, 0.5] # Optional if using auto-vectorization
}
{:ok, object} = Data.insert(client, "Article", data)
uuid = object["id"]
# Named vectors (v0.7.1+) - for collections with multiple vector spaces
data_with_named_vectors = %{
  properties: %{"title" => "Multi-vector article"},
  vectors: %{
    "title_vector" => [0.1, 0.2, 0.3],
    "content_vector" => [0.4, 0.5, 0.6, 0.7]
  }
}
{:ok, _multi_vector_object} =
  Data.insert(client, "MultiVectorCollection", data_with_named_vectors)
# Read - get object by ID
{:ok, retrieved} = Data.get_by_id(client, "Article", uuid)
# Update - partial update (PATCH)
{:ok, updated} = Data.patch(client, "Article", uuid, %{
properties: %{"views" => 42},
vector: [0.1, 0.2, 0.3, 0.4, 0.5]
})
# Check if object exists
{:ok, true} = Data.exists?(client, "Article", uuid)
# Delete
{:ok, _} = Data.delete_by_id(client, "Article", uuid)
```
Collection handles with default tenant/consistency:
```elixir
collection =
WeaviateEx.Collection.new(client, "Article",
tenant: "tenant-a",
consistency_level: "QUORUM"
)
{:ok, _} = WeaviateEx.Collection.insert(collection, %{properties: %{title: "Tenant scoped"}})
```
#### Inline References During Insert (v0.7.1+)
Create objects with references in a single operation:
```elixir
# Insert object with inline references
{:ok, article} = WeaviateEx.Objects.create("Article", %{
properties: %{
title: "My Article",
content: "Article content..."
},
# Single reference
references: %{
"hasAuthor" => "author-uuid-here"
}
})
# Multiple references to same property
{:ok, article} = WeaviateEx.Objects.create("Article", %{
properties: %{title: "Collaborative Article"},
references: %{
"hasAuthors" => ["author-uuid-1", "author-uuid-2", "author-uuid-3"]
}
})
# Multi-target references (pointing to specific collection)
{:ok, article} = WeaviateEx.Objects.create("Article", %{
properties: %{title: "Related Content"},
references: %{
"relatedTo" => %{
target_collection: "Category",
uuids: "category-uuid"
}
}
})
# Multiple multi-target references
{:ok, article} = WeaviateEx.Objects.create("Article", %{
properties: %{title: "Multi-related"},
references: %{
"mentions" => %{
target_collection: "Person",
uuids: ["person-1", "person-2"]
}
}
})
```
References are automatically converted to Weaviate beacon format.
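A beacon is a URI of the form `weaviate://localhost/<Collection>/<uuid>`, and the REST API expects each reference as a map with a `"beacon"` key. The sketch below illustrates the conversion; `BeaconSketch` is a hypothetical module and does not reproduce the library's own conversion code.

```elixir
defmodule BeaconSketch do
  # Hypothetical illustration of the beacon format.
  @prefix "weaviate://localhost/"

  # Builds a beacon that names the target collection.
  def build(uuid, collection) when is_binary(uuid) and is_binary(collection) do
    @prefix <> collection <> "/" <> uuid
  end

  # Splits a beacon back into its collection and UUID.
  def parse(@prefix <> rest) do
    case String.split(rest, "/") do
      [collection, uuid] -> {:ok, %{collection: collection, uuid: uuid}}
      _other -> :error
    end
  end

  def parse(_other), do: :error

  # Wraps one UUID or a list of UUIDs as reference payloads.
  def to_references(collection, uuids) do
    uuids
    |> List.wrap()
    |> Enum.map(fn uuid -> %{"beacon" => build(uuid, collection)} end)
  end
end

BeaconSketch.to_references("Person", ["person-1", "person-2"])
```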
#### Reference Operations API (v0.7.3+)
For managing references after object creation, use the References API with full multi-target support:
```elixir
alias WeaviateEx.API.References
alias WeaviateEx.Data.ReferenceToMulti
alias WeaviateEx.Types.Beacon
# Add a single reference
{:ok, _} = References.add(client, "Article", article_uuid, "hasAuthor", author_uuid)
# Add a multi-target reference using ReferenceToMulti
ref = ReferenceToMulti.new("Person", person_uuid)
{:ok, _} = References.add(client, "Article", article_uuid, "hasAuthor", ref)
# Add multiple references at once
ref = ReferenceToMulti.new("Person", [person1_uuid, person2_uuid])
{:ok, _} = References.add(client, "Article", article_uuid, "hasAuthors", ref)
# Replace all references on a property
{:ok, _} = References.replace(client, "Article", article_uuid, "hasAuthors",
[author1_uuid, author2_uuid, author3_uuid]
)
# Replace with multi-target references pointing to different collections
{:ok, _} = References.replace(client, "Article", article_uuid, "relatedTo", [
ReferenceToMulti.new("Person", person_uuid),
ReferenceToMulti.new("Organization", org_uuid)
])
# Delete a reference
{:ok, _} = References.delete(client, "Article", article_uuid, "hasAuthor", author_uuid)
# Batch add references
refs = [
%{from_uuid: "article-1", from_property: "hasAuthor", to_uuid: "author-1"},
%{from_uuid: "article-2", from_property: "hasAuthor", to_uuid: "author-2",
target_collection: "Person"} # For multi-target properties
]
{:ok, _} = References.add_many(client, "Article", refs)
# Parse beacon URLs
parsed = Beacon.parse("weaviate://localhost/Person/uuid-123")
# => %{collection: "Person", uuid: "uuid-123"}
# Build beacon URLs
beacon = Beacon.build("uuid-123", "Person")
# => "weaviate://localhost/Person/uuid-123"
```
### Objects API
Full CRUD operations with explicit UUID control:
```elixir
# Create with custom UUID
{:ok, object} = WeaviateEx.Objects.create("Article", %{
id: "custom-uuid-here", # Optional
properties: %{
title: "Hello Weaviate",
content: "This is a test article",
publishedAt: "2025-01-15T10:00:00Z"
},
vector: [0.1, 0.2, 0.3] # Optional
})
# Get an object with additional fields
{:ok, object} = WeaviateEx.Objects.get("Article", uuid,
include: "vector,classification"
)
# List objects with pagination
{:ok, result} = WeaviateEx.Objects.list("Article",
limit: 10,
offset: 0,
include: "vector"
)
# Update (full replacement)
{:ok, updated} = WeaviateEx.Objects.update("Article", uuid, %{
properties: %{
title: "Updated Title",
content: "Updated content"
}
})
# Patch (partial update)
{:ok, patched} = WeaviateEx.Objects.patch("Article", uuid, %{
properties: %{title: "New Title"}
})
# Delete
{:ok, _} = WeaviateEx.Objects.delete("Article", uuid)
# Check existence
{:ok, true} = WeaviateEx.Objects.exists?("Article", uuid)
```
Payloads are validated client-side: `properties` is required for inserts and updates, and the
property names `id` and `vector` are reserved; using either one raises an `ArgumentError`.
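The two validation rules can be sketched as follows. `PayloadSketch` is a hypothetical module written for illustration; the library's real checks and error messages may differ, and this sketch only accepts an atom `:properties` key.

```elixir
defmodule PayloadSketch do
  # Hypothetical illustration, not WeaviateEx's validation code.
  @reserved ["id", "vector"]

  # Returns the payload when valid; raises ArgumentError otherwise.
  def validate!(%{properties: properties} = payload) when is_map(properties) do
    names = properties |> Map.keys() |> Enum.map(&to_string/1)

    case Enum.find(names, &(&1 in @reserved)) do
      nil -> payload
      name -> raise ArgumentError, "property name #{inspect(name)} is reserved"
    end
  end

  def validate!(_payload) do
    raise ArgumentError, "payload requires a :properties map"
  end
end

PayloadSketch.validate!(%{properties: %{title: "Hello"}})
```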
### Complex Data Types
WeaviateEx automatically serializes complex Elixir types when creating or updating objects:
```elixir
alias WeaviateEx.Types.{GeoCoordinate, PhoneNumber, Blob}
# DateTime - serialized to RFC3339/ISO8601
%{created_at: ~U[2024-01-01 00:00:00Z]}
# -> {"created_at": "2024-01-01T00:00:00Z"}
# Date - serialized as midnight UTC
%{published_date: ~D[2024-06-15]}
# -> {"published_date": "2024-06-15T00:00:00Z"}
# GeoCoordinate - serialized to lat/lon map
{:ok, geo} = GeoCoordinate.new(40.71, -74.00)
%{location: geo}
# -> {"location": {"latitude": 40.71, "longitude": -74.00}}
# PhoneNumber - serialized with input and country
phone = PhoneNumber.new("555-1234", default_country: "US")
%{contact: phone}
# -> {"contact": {"input": "555-1234", "defaultCountry": "US"}}
# Blob (binary data) - base64 encoded
blob = Blob.new(binary_image_data) # binary_image_data holds raw bytes, e.g. from File.read!/1
%{image: blob}
# -> {"image": "<base64 encoded string>"}
# Nested objects with complex types
{:ok, geo} = GeoCoordinate.new(40.7128, -74.0060)
{:ok, place} = WeaviateEx.Objects.create("Place", %{
properties: %{
name: "Central Park",
location: geo,
created_at: ~U[2024-01-01 00:00:00Z],
metadata: %{
last_visited: ~D[2024-12-25]
}
}
})
```
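The date handling described above can be reproduced with the standard library alone. The following sketch covers only `DateTime`, `Date`, and nested maps or lists; `SerializeSketch` is a hypothetical module and not the serializer the library ships.

```elixir
defmodule SerializeSketch do
  # Hypothetical illustration covering DateTime, Date, maps and lists.

  # DateTime values become RFC 3339 / ISO 8601 strings.
  def encode(%DateTime{} = dt), do: DateTime.to_iso8601(dt)

  # Dates are pinned to midnight UTC before formatting.
  def encode(%Date{} = date) do
    date |> DateTime.new!(~T[00:00:00], "Etc/UTC") |> encode()
  end

  # Plain maps are walked recursively so nested values are converted too.
  def encode(map) when is_map(map) and not is_struct(map) do
    Map.new(map, fn {key, value} -> {key, encode(value)} end)
  end

  def encode(list) when is_list(list), do: Enum.map(list, &encode/1)

  # Everything else (numbers, strings, other structs) passes through.
  def encode(other), do: other
end

SerializeSketch.encode(%{
  created_at: ~U[2024-01-01 00:00:00Z],
  metadata: %{last_visited: ~D[2024-12-25]}
})
```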
#### Deserializing Responses
Convert Weaviate response data back to rich Elixir types:
```elixir
alias WeaviateEx.Types.Deserialize
# Parse individual values
{:ok, dt} = Deserialize.deserialize("2024-01-01T00:00:00Z", :date)
# => {:ok, ~U[2024-01-01 00:00:00Z]}
{:ok, geo} = Deserialize.deserialize(
%{"latitude" => 52.37, "longitude" => 4.90},
:geo_coordinates
)
# => {:ok, %GeoCoordinate{latitude: 52.37, longitude: 4.90}}
# Deserialize properties with schema hints
schema = %{"created_at" => :date, "location" => :geo_coordinates}
{:ok, props} = Deserialize.deserialize_properties(raw_props, schema)
# Auto-detect types based on value structure
{:ok, props} = Deserialize.auto_deserialize(response["properties"])
```
### Batch Operations
Efficient bulk operations for importing large datasets:
```elixir
# Batch create multiple objects
objects = [
%{class: "Article", properties: %{title: "Article 1", content: "Content 1"}},
%{class: "Article", properties: %{title: "Article 2", content: "Content 2"}},
%{class: "Article", properties: %{title: "Article 3", content: "Content 3"}}
]
{:ok, summary} = WeaviateEx.Batch.create_objects(objects, return_summary: true)
# Check rolled-up stats and per-object errors
summary.statistics
#=> %{processed: 3, successful: 3, failed: 0}
require Logger
Enum.each(summary.errors, fn error ->
  Logger.warning("[Batch error] #{error.id} => #{Enum.join(error.messages, "; ")}")
end)
# If every object in the batch fails, Batch.create_objects/2 returns
# {:error, %WeaviateEx.Error{type: :batch_all_failed}}.
# Batch delete with criteria (WHERE filter)
{:ok, result} = WeaviateEx.Batch.delete_objects(%{
class: "Article",
where: %{
path: ["status"],
operator: "Equal",
valueText: "draft"
}
})
```
### Concurrent Batch Operations
High-throughput parallel batch processing with failure tracking:
```elixir
alias WeaviateEx.Batch.Concurrent
alias WeaviateEx.Batch.Queue
# Concurrent batch insertion with parallel processing
objects = Enum.map(1..10_000, fn i ->
%{class: "Article", properties: %{title: "Article #{i}", content: "Content #{i}"}}
end)
{:ok, result} = Concurrent.insert_many(client, "Article", objects,
max_concurrency: 8, # Parallel batch requests
batch_size: 200, # Objects per request
ordered: false, # Don't maintain order (faster)
timeout: 60_000 # Timeout per batch
)
# Check results
IO.puts(Concurrent.Result.summary(result))
# => "Inserted 10000/10000 objects in 50 batches (1234ms). Failures: 0, Batch errors: 0"
if Concurrent.Result.all_successful?(result) do
IO.puts("All objects inserted successfully!")
else
IO.puts("Some failures occurred")
for failed <- result.failed do
IO.puts("Failed: #{failed.id} - #{failed.error}")
end
end
# Batch Queue for failure tracking and re-queuing
queue = Queue.new()
# Add objects to queue
queue = Enum.reduce(objects, queue, fn obj, q ->
Queue.enqueue(q, obj)
end)
# Dequeue a batch for processing
{batch, queue} = Queue.dequeue_batch(queue, 100)
# Process the batch, then mark failures; `failed_objects` is a list of
# {object, reason} tuples collected while processing
queue = Enum.reduce(failed_objects, queue, fn {obj, reason}, q ->
Queue.mark_failed(q, obj, reason)
end)
# Re-queue failed objects for retry (with max retry limit)
queue = Queue.requeue_failed(queue, max_retries: 3)
# Get queue statistics
IO.puts("Pending: #{Queue.pending_count(queue)}")
IO.puts("Failed: #{Queue.failed_count(queue)}")
IO.puts("Empty: #{Queue.empty?(queue)}")
# Rate limit detection
alias WeaviateEx.Batch.RateLimit
response = %{status: 429, headers: [{"retry-after", "5"}]}
case RateLimit.detect(response) do
:ok -> IO.puts("No rate limit")
{:rate_limited, wait_ms} ->
IO.puts("Rate limited, wait #{wait_ms}ms")
Process.sleep(wait_ms)
end
# Server queue monitoring for dynamic batch sizing
alias WeaviateEx.API.Cluster
{:ok, stats} = Cluster.batch_stats(client)
IO.puts("Queue length: #{stats.queue_length}")
IO.puts("Rate: #{stats.rate_per_second}/s")
IO.puts("Failed: #{stats.failed_count}")
```
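The `batch_size` and `max_concurrency` options map naturally onto `Enum.chunk_every/2` and `Task.async_stream/3`. The sketch below shows that pattern with an injected `send_batch` function standing in for the network call; `ConcurrentSketch` and the shape of its result map are hypothetical, not the library's `Concurrent` module.

```elixir
defmodule ConcurrentSketch do
  # Hypothetical illustration of chunked, parallel batch insertion.

  # `send_batch` takes a list of objects and returns {:ok, count} or
  # {:error, reason}; it stands in for the real request.
  def insert_many(objects, send_batch, opts \\ []) do
    batch_size = Keyword.get(opts, :batch_size, 200)

    objects
    |> Enum.chunk_every(batch_size)
    |> Task.async_stream(send_batch,
      max_concurrency: Keyword.get(opts, :max_concurrency, 8),
      ordered: false,
      timeout: Keyword.get(opts, :timeout, 60_000)
    )
    |> Enum.reduce(%{batches: 0, inserted: 0, batch_errors: 0}, fn
      {:ok, {:ok, count}}, acc ->
        %{acc | batches: acc.batches + 1, inserted: acc.inserted + count}

      {:ok, {:error, _reason}}, acc ->
        %{acc | batches: acc.batches + 1, batch_errors: acc.batch_errors + 1}
    end)
  end
end

objects = Enum.map(1..10_000, fn i -> %{title: "Article #{i}"} end)
ConcurrentSketch.insert_many(objects, fn batch -> {:ok, length(batch)} end)
```

With 10,000 objects and a batch size of 200, the stream processes 50 chunks, matching the batch count in the summary line shown earlier.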
### gRPC Batch Streaming (v0.6.0+)
Bidirectional gRPC streaming for high-throughput batch operations (requires Weaviate 1.34+):
```elixir
alias WeaviateEx.Batch.Stream
# Create a streaming batch session
{:ok, stream} = Stream.new(client, "Article",
buffer_size: 200, # Objects per batch
flush_interval_ms: 1000, # Auto-flush interval
auto_flush: true # Enable automatic flushing
)
# Add objects to the stream buffer
{:ok, stream} = Stream.add(stream, %{
properties: %{title: "Article 1", content: "Content 1"}
})
{:ok, stream} = Stream.add(stream, %{
properties: %{title: "Article 2", content: "Content 2"}
})
# Flush manually without waiting for the buffer to fill or the interval to elapse
{:ok, stream} = Stream.flush(stream)
# Add many objects efficiently
objects = Enum.map(1..1000, fn i ->
%{properties: %{title: "Article #{i}", content: "Content #{i}"}}
end)
{:ok, stream} = Enum.reduce(objects, {:ok, stream}, fn obj, {:ok, s} ->
Stream.add(s, obj)
end)
# Close stream and get final results
{:ok, results} = Stream.close(stream)
# Results include success/failure for each object
Enum.each(results, fn result ->
case result do
%{status: :success, uuid: uuid} ->
IO.puts("Created: #{uuid}")
%{status: :failed, error: error} ->
IO.puts("Failed: #{error}")
end
end)
```
When the server sends backoff messages, the stream automatically updates its
buffer size to the server-provided batch size for subsequent flushes.
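To illustrate the buffering and backoff behaviour described above, here is a small sketch in plain Elixir. `StreamBufferSketch` is a hypothetical stand-in, not the real `WeaviateEx.Batch.Stream`: it records flushed batches in memory instead of sending them, and it only shows how a server-provided batch size could replace the configured buffer size.

```elixir
defmodule StreamBufferSketch do
  defstruct buffer: [], buffer_size: 200, flushed: []

  def new(opts \\ []), do: %__MODULE__{buffer_size: Keyword.get(opts, :buffer_size, 200)}

  # Adds an object and flushes automatically once the buffer is full.
  def add(%__MODULE__{} = s, object) do
    s = %{s | buffer: [object | s.buffer]}
    if length(s.buffer) >= s.buffer_size, do: flush(s), else: s
  end

  # "Sends" the buffered objects as one batch (recorded in :flushed here).
  def flush(%__MODULE__{buffer: []} = s), do: s

  def flush(%__MODULE__{} = s) do
    %{s | buffer: [], flushed: s.flushed ++ [Enum.reverse(s.buffer)]}
  end

  # Applies a server backoff message: later flushes use the server's batch size.
  def handle_backoff(%__MODULE__{} = s, batch_size)
      when is_integer(batch_size) and batch_size > 0 do
    %{s | buffer_size: batch_size}
  end
end
```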
#### Low-Level gRPC Streaming
For advanced use cases, access the underlying gRPC stream directly:
```elixir
alias WeaviateEx.GRPC.Services.BatchStream
# Open a bidirectional stream
{:ok, stream_handle} = BatchStream.open(client.grpc_channel)
# Send objects
:ok = BatchStream.send_objects(stream_handle, [
%{collection: "Article", properties: %{title: "Test"}, uuid: nil, vector: nil}
])
# Send cross-references
:ok = BatchStream.send_references(stream_handle, [
%{from_collection: "Article", from_uuid: "...", to_collection: "Author", to_uuid: "..."}
])
# Receive results
{:ok, results} = BatchStream.receive_results(stream_handle, timeout: 5000)
# Close the stream
:ok = BatchStream.close(stream_handle)
```
### Background Batch Processing (v0.7.0+)
For high-throughput scenarios, use the background batcher for continuous async processing:
```elixir
alias WeaviateEx.Batch.Background
# Start a background batch processor
{:ok, batcher} = WeaviateEx.Batch.background(client, "Article",
batch_size: 100,
concurrent_requests: 2,
flush_interval: 1000
)
# Add objects asynchronously (non-blocking)
for article <- articles do
:ok = Background.add_object(batcher, %{
title: article.title,
content: article.content
})
end
# Add objects with explicit UUID and vector
:ok = Background.add_object(batcher, %{title: "Test"},
uuid: "550e8400-e29b-41d4-a716-446655440000",
vector: [0.1, 0.2, 0.3]
)
# Add references (automatically ordered after related objects)
:ok = Background.add_reference(batcher, article_uuid, "hasAuthor", author_uuid)
# Force immediate flush
:ok = Background.flush(batcher)
# Get current results
results = Background.get_results(batcher)
IO.puts("Imported #{map_size(results.successful_uuids)} objects")
# Stop and get final results (with flush)
results = Background.stop(batcher, flush: true)
```
### Batch Safety Features (v0.7.4+)
WeaviateEx implements production-grade batch safety for reliable large-scale operations:
#### Memory Management
```elixir
# Stored results are capped at 100,000 entries (MAX_STORED_RESULTS) so that
# large batch runs cannot grow memory without bound.
alias WeaviateEx.Batch.ErrorTracking.Results
# Check the limit
Results.max_stored_results()
#=> 100_000
# When the limit is exceeded, the oldest entries are evicted automatically.
```
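The eviction policy can be pictured as a bounded FIFO. The sketch below is an assumption about the general technique, not the library's internal data structure; `BoundedResultsSketch` is a hypothetical name, and it uses Erlang's `:queue` so that dropping the oldest entry stays cheap.

```elixir
defmodule BoundedResultsSketch do
  defstruct queue: :queue.new(), size: 0, max: 100_000

  def new(max) when is_integer(max) and max > 0, do: %__MODULE__{max: max}

  # Below the limit: append the result.
  def add(%__MODULE__{size: size, max: max} = r, result) when size < max do
    %{r | queue: :queue.in(result, r.queue), size: size + 1}
  end

  # At the limit: drop the oldest entry, then append the new one.
  def add(%__MODULE__{} = r, result) do
    {_oldest, rest} = :queue.out(r.queue)
    %{r | queue: :queue.in(result, rest)}
  end

  # Returns the stored results, oldest first.
  def to_list(%__MODULE__{queue: q}), do: :queue.to_list(q)
end
```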
#### Auto-Retry for Failed Objects
```elixir
alias WeaviateEx.Batch.Dynamic
require Logger
# Dynamic batcher with auto-retry enabled (default)
{:ok, batcher} = Dynamic.start(
client: client,
auto_retry: true, # Enable automatic retry (default: true)
max_retries: 5, # Maximum retry attempts (default: 3)
retry_delay_ms: 2000, # Base delay for backoff (default: 1000ms)
on_permanent_failure: fn objects ->
Logger.error("Permanent failures: #{length(objects)}")
# Handle objects that exceeded max_retries
end
)
# Add objects - failed objects are automatically re-queued
Dynamic.add_object(batcher, "Article", %{title: "Test"})
# Retryable errors include:
# - Rate limit errors (429, "rate limit exceeded", etc.)
# - Transient gRPC errors (UNAVAILABLE, RESOURCE_EXHAUSTED, ABORTED, DEADLINE_EXCEEDED)
```
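The retry decision above combines two questions: is the error retryable, and how long should the batcher wait? The sketch below shows one way to express that in plain Elixir. `RetryPolicySketch`, the error map shapes, and the doubling backoff are all assumptions made for illustration; the source only states that `retry_delay_ms` is the base delay.

```elixir
defmodule RetryPolicySketch do
  # Transient gRPC statuses named in the list above, as hypothetical atoms.
  @retryable_grpc [:unavailable, :resource_exhausted, :aborted, :deadline_exceeded]

  def retryable?(%{status: 429}), do: true
  def retryable?(%{grpc_status: status}) when status in @retryable_grpc, do: true

  def retryable?(%{message: message}) when is_binary(message) do
    String.contains?(String.downcase(message), "rate limit")
  end

  def retryable?(_error), do: false

  # Delay before retry number `attempt` (1-based): base, 2x base, 4x base, ...
  def delay_ms(attempt, base_delay_ms) when attempt >= 1 do
    base_delay_ms * Integer.pow(2, attempt - 1)
  end

  # Decides what to do with a failed object that has been retried `attempts` times.
  def next_step(error, attempts, opts) do
    max_retries = Keyword.get(opts, :max_retries, 3)
    base = Keyword.get(opts, :retry_delay_ms, 1_000)

    cond do
      not retryable?(error) -> :permanent_failure
      attempts >= max_retries -> :permanent_failure
      true -> {:retry_after, delay_ms(attempts + 1, base)}
    end
  end
end
```

Objects that end up as `:permanent_failure` correspond to what the `on_permanent_failure` callback receives.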
#### RetryQueue for Manual Control
```elixir
alias WeaviateEx.Batch.RetryQueue
require Logger
# Start a retry queue for manual control
{:ok, retry_queue} = RetryQueue.start_link(
client: client,
max_retries: 3,
base_delay_ms: 1000,
on_permanent_failure: fn objects ->
Logger.error("Failed after max retries: #{length(objects)}")
end
)
# Enqueue failed objects for retry
:ok = RetryQueue.enqueue_failed(retry_queue, failed_objects)
# Check retry count for a specific object
count = RetryQueue.get_retry_count(retry_queue, "uuid-123")
# Drain all queued objects for manual processing
{:ok, objects} = RetryQueue.drain(retry_queue)
# Clear the queue
:ok = RetryQueue.clear(retry_queue)
```
#### Configurable Batch Options
```elixir
alias WeaviateEx.Batch.Config
require Logger
# Create a batch configuration
config = Config.new(
max_stored_results: 50_000, # Custom limit
auto_retry: true,
max_retries: 5,
retry_delay_ms: 2000,
on_permanent_failure: fn objects ->
Logger.error("Failed: #{length(objects)}")
end
)
# Access configuration values
Config.auto_retry_enabled?(config) #=> true
Config.default_max_retries() #=> 3
```
### Queries & Vector Search
Build queries with field selection, semantic search, hybrid search, BM25, filters, and sorting:
```elixir
alias WeaviateEx.Query
# Simple query with field selection
query = Query.get("Article")
|> Query.fields(["title", "content", "publishedAt"])
|> Query.limit(10)
{:ok, results} = Query.execute(query)
# Semantic search with near_text (requires vectorizer)
query = Query.get("Article")
|> Query.near_text("artificial intelligence", certainty: 0.7)
|> Query.fields(["title", "content"])
|> Query.additional(["certainty", "distance"])
|> Query.limit(5)
{:ok, results} = Query.execute(query)
# Vector search with custom vectors
query = Query.get("Article")
|> Query.near_vector([0.1, 0.2, 0.3], certainty: 0.8)
|> Query.fields(["title"])
{:ok, results} = Query.execute(query)
# Hybrid search (combines keyword + vector)
query = Query.get("Article")
|> Query.hybrid("machine learning", alpha: 0.5) # alpha: 0=keyword, 1=vector
|> Query.fields(["title", "content"])
{:ok, results} = Query.execute(query)
# BM25 keyword search
query = Query.get("Article")
|> Query.bm25("elixir programming")
|> Query.fields(["title", "content"])
{:ok, results} = Query.execute(query)
# Semantic direction with Move (v0.5.0+)
query = Query.get("Article")
|> Query.near_text("technology",
move_to: [concepts: ["artificial intelligence", "machine learning"], force: 0.8],
move_away: [concepts: ["politics", "sports"], force: 0.5]
)
|> Query.fields(["title", "content"])
{:ok, results} = Query.execute(query)
# Queries with filters (WHERE clause)
query = Query.get("Article")
|> Query.where(%{
path: ["publishedAt"],
operator: "GreaterThan",
valueDate: "2025-01-01T00:00:00Z"
})
|> Query.fields(["title", "publishedAt"])
|> Query.sort([%{path: ["publishedAt"], order: "desc"}])
{:ok, results} = Query.execute(query)
```
#### Fetch Objects by IDs
```elixir
alias WeaviateEx.API.Data
ids = [
"550e8400-e29b-41d4-a716-446655440001",
"550e8400-e29b-41d4-a716-446655440002"
]
{:ok, objects} = Data.fetch_objects_by_ids(client, "Article", ids,
return_properties: ["title", "content"]
)
# Results preserve the input ID order.
```
```elixir
# Using the Objects module (no client needed)
{:ok, objects} = WeaviateEx.Objects.fetch_objects_by_ids("Article", ids,
return_properties: ["title", "content"]
)
```
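The first example notes that results preserve the input ID order. As an illustration of what that guarantee means, the sketch below reorders a list of fetched objects to match the requested IDs. `OrderByIdsSketch` is hypothetical, and the `"id"` key is an assumption about the object shape rather than the library's documented return format.

```elixir
defmodule OrderByIdsSketch do
  # Returns `objects` in the same order as `ids`, skipping IDs that were not found.
  def reorder(objects, ids) do
    by_id = Map.new(objects, fn %{"id" => id} = object -> {id, object} end)

    ids
    |> Enum.map(&Map.get(by_id, &1))
    |> Enum.reject(&is_nil/1)
  end
end
```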
#### gRPC vs GraphQL
When you pass a `WeaviateEx.Client`, `Query.execute/2` uses gRPC and now supports
filters, group_by, target vectors, near_image/near_media, references, vector metadata,
reranking, and generative search (RAG). If a query includes options not yet supported in gRPC
(for example sorting or cursor pagination), it automatically falls back to GraphQL.
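The fallback rule can be read as a simple check on the options a query uses. The sketch below is illustrative only: `TransportSketch` is hypothetical, the query is modelled as a plain map, and the option keys `:sort` and `:after` are assumed names for sorting and cursor pagination.

```elixir
defmodule TransportSketch do
  # Options this sketch treats as GraphQL-only, based on the examples in the text.
  @graphql_only [:sort, :after]

  # `query` is a plain map of the options set on the query.
  def choose(query) when is_map(query) do
    if Enum.any?(@graphql_only, &Map.has_key?(query, &1)), do: :graphql, else: :grpc
  end
end
```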
#### Reranking
Improve search result relevance using reranker models:
```elixir
alias WeaviateEx.Query
alias WeaviateEx.Query.Rerank
# Basic reranking - re-scores results using the "content" property
rerank = Rerank.new("content")
{:ok, results} = Query.get("Article")
|> Query.near_text("machine learning")
|> Query.fields(["title", "content"])
|> Query.limit(10)
|> Query.rerank(rerank)
|> Query.execute(client)
# With custom rerank query (different from search query)
rerank = Rerank.new("content", query: "latest AI applications in healthcare")
{:ok, results} = Query.get("Article")
|> Query.hybrid("AI trends", alpha: 0.5)
|> Query.fields(["title", "content"])
|> Query.rerank(rerank)
|> Query.execute(client)
# Access rerank scores in results
for result <- results do
score = result["_additional"]["rerankScore"]
IO.puts("Rerank score: #{score}")
end
```
**Note:** Requires a reranker module configured on the collection. See
`WeaviateEx.API.RerankerConfig` for available rerankers: `cohere`, `transformers`,
`voyageai`, `jinaai`, `nvidia`, `contextualai`.
#### gRPC Generative Search (v0.7.4+)
Generative queries now use gRPC for improved performance (~2-3x lower latency):
```elixir
alias WeaviateEx.GRPC.Services.Search
alias WeaviateEx.Query.GenerativeResult
# Build a search request with generative config
request = Search.build_near_text_request("Article", "machine learning",
limit: 5,
return_properties: ["title", "content"],
generative: %{
single_prompt: "Summarize this article: {content}",
provider: :openai,
model: "gpt-4",
temperature: 0.7
}
)
# Execute the search (`channel` is the client's gRPC channel, e.g. `client.grpc_channel`)
{:ok, reply} = Search.execute(channel, request)
# Parse the generative results
result = GenerativeResult.from_grpc_response(reply)
# Access per-object generations
for gen <- result.generated_per_object do
IO.puts("Generated: #{gen}")
end
# Grouped generation
request = Search.build_near_text_request("Article", "AI trends",
generative: %{
grouped_task: "Synthesize the key themes from these articles",
grouped_properties: ["title", "content"],
provider: :anthropic,
model: "claude-3-5-sonnet-20241022"
}
)
{:ok, reply} = Search.execute(channel, request)
result = GenerativeResult.from_grpc_response(reply)
IO.puts("Grouped summary: #{result.generated}")
```
Supported providers: `:openai`, `:anthropic`, `:cohere`, `:mistral`, `:ollama`,
`:google`, `:aws`, `:databricks`, `:friendliai`, `:nvidia`, `:xai`, `:contextualai`, `:anyscale`.
### Multi-Vector Collections (v0.7.0+)
Query collections with multiple named vectors:
```elixir
alias WeaviateEx.Query
alias WeaviateEx.Query.TargetVectors
# Single target vector
query = Query.get("MultiVectorCollection")
|> Query.near_text("search term", target_vectors: "content_vector")
|> Query.fields(["title", "content"])
{:ok, results} = Query.execute(query, client)
# Combined vectors with average method
target = TargetVectors.combine(["title_vector", "content_vector"], method: :average)
query = Query.get("MultiVectorCollection")
|> Query.near_vector(embedding, target_vectors: target)
|> Query.fields(["title"])
{:ok, results} = Query.execute(query, client)
# Weighted combination
target = TargetVectors.weighted(%{
"title_vector" => 0.7,
"content_vector" => 0.3
})
query = Query.get("MultiVectorCollection")
|> Query.near_text("search", target_vectors: target)
|> Query.fields(["title", "content"])
{:ok, results} = Query.execute(query, client)
```
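To make the difference between the `:average` and weighted strategies concrete, the sketch below combines per-vector distances for a single object. It is a simplified model under the assumption that each named vector yields one distance per object; `TargetVectorSketch` is hypothetical, and the server performs the real combination.

```elixir
defmodule TargetVectorSketch do
  # `distances` maps vector name => distance of one object for that vector.
  # It must be non-empty.
  def combine(distances, :average) do
    Enum.sum(Map.values(distances)) / map_size(distances)
  end

  # `weights` maps vector name => weight; the result is the weighted sum.
  def combine(distances, {:weighted, weights}) do
    Enum.reduce(weights, 0.0, fn {name, weight}, acc ->
      acc + weight * Map.fetch!(distances, name)
    end)
  end
end
```

With weights of 0.7 and 0.3, a close match on `title_vector` influences the combined distance more than an equally close match on `content_vector`.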
#### Updating Named Vector Configuration (v0.7.0+)
Update existing named vector index settings and quantization:
```elixir
alias WeaviateEx.API.NamedVectors
# Update vector index parameters
update = NamedVectors.update_config("title_vector",
vector_index: [
ef: 200,
dynamic_ef_min: 100,
dynamic_ef_max: 500,
dynamic_ef_factor: 8,
flat_search_cutoff: 40_000
]
)
# Update with quantization settings
update = NamedVectors.update_config("content_vector",
vector_index: [ef: 150],
quantizer: [
type: :pq,
segments: 128,
centroids: 256,
training_limit: 100_000
]
)
# Build update config for multiple vectors at once
updates = NamedVectors.build_update_config([
{"title_vector", [vector_index: [ef: 200]]},
{"content_vector", [quantizer: [type: :sq, rescore_limit: 200]]}
])
# Convert to API format
api_config = NamedVectors.update_to_api(update)
```
### Advanced Hybrid Search (v0.7.0+)
Use `HybridVector` to control the vector sub-search of a hybrid query, including Move operations and target vectors:
```elixir
alias WeaviateEx.Query
alias WeaviateEx.Query.{HybridVector, Move}
# Text sub-search with Move operations
hv = HybridVector.near_text("machine learning",
move_to: Move.to(0.5, concepts: ["AI", "neural networks"]),
move_away_from: Move.to(0.3, concepts: ["biology"])
)
query = Query.get("Article")
|> Query.hybrid("search term", vector: hv, alpha: 0.7)
|> Query.fields(["title", "content"])
{:ok, results} = Query.execute(query, client)
# Vector sub-search with target vectors
hv = HybridVector.near_vector(embedding, target_vectors: "content_vector")
query = Query.get("Article")
|> Query.hybrid("search", vector: hv, fusion_type: :relative_score)
|> Query.fields(["title"])
{:ok, results} = Query.execute(query, client)
```
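The `alpha` and `fusion_type: :relative_score` options decide how keyword and vector results are merged. The sketch below approximates relative score fusion: each result set is min-max normalized, then blended with `alpha`. It is a simplified model for intuition, not Weaviate's implementation, and `HybridFusionSketch` is a hypothetical name.

```elixir
defmodule HybridFusionSketch do
  # Min-max normalizes a map of id => raw score into the 0.0..1.0 range.
  def normalize(scores) when map_size(scores) == 0, do: %{}

  def normalize(scores) do
    {min, max} = scores |> Map.values() |> Enum.min_max()
    range = max - min

    Map.new(scores, fn {id, score} ->
      {id, if(range == 0, do: 1.0, else: (score - min) / range)}
    end)
  end

  # Blends the two result sets; alpha 0.0 = keyword only, 1.0 = vector only.
  # Returns {id, score} pairs sorted by descending fused score.
  def fuse(keyword_scores, vector_scores, alpha) do
    keyword = normalize(keyword_scores)
    vector = normalize(vector_scores)
    ids = Enum.uniq(Map.keys(keyword) ++ Map.keys(vector))

    ids
    |> Enum.map(fn id ->
      {id, alpha * Map.get(vector, id, 0.0) + (1 - alpha) * Map.get(keyword, id, 0.0)}
    end)
    |> Enum.sort_by(fn {_id, score} -> score end, :desc)
  end
end
```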
### Multimodal Search
Search using images, audio, video, and other media types (v0.7.0+).
#### Image Search (near_image)
Search collections using image data with multi2vec-clip, multi2vec-bind, or other image vectorizers:
```elixir
alias WeaviateEx.Query
alias WeaviateEx.Query.NearImage
# Search by base64 encoded image
query = Query.get("ImageCollection")
|> Query.near_image(image: base64_image_data, certainty: 0.8)
|> Query.fields(["name", "description"])
|> Query.limit(10)
{:ok, results} = Query.execute(query, client)
# Search by image file path
query = Query.get("ImageCollection")
|> Query.near_image(image_file: "/path/to/image.png", distance: 0.3)
|> Query.fields(["name"])
{:ok, results} = Query.execute(query, client)
# With named vectors (for collections with multiple vector spaces)
query = Query.get("MultiVectorCollection")
|> Query.near_image(
image: base64_data,
certainty: 0.7,
target_vectors: ["image_vector", "clip_vector"]
)
|> Query.fields(["title"])
{:ok, results} = Query.execute(query, client)
# Using NearImage directly
near_image = NearImage.new(image: base64_data, certainty: 0.8)
NearImage.to_graphql(near_image) # => %{"image" => "...", "certainty" => 0.8}
NearImage.to_grpc(near_image) # => %{image: "...", certainty: 0.8}
# Encode image file to base64
base64_data = NearImage.encode_image_file("/path/to/image.jpg")
```
#### Media Search (near_media)
Search using audio, video, thermal, depth, or IMU data with multi2vec-bind:
```elixir
alias WeaviateEx.Query
alias WeaviateEx.Query.NearMedia
# Search by audio
query = Query.get("MediaCollection")
|> Query.near_media(:audio, media: base64_audio, certainty: 0.7)
|> Query.fields(["name", "transcript"])
|> Query.limit(5)
{:ok, results} = Query.execute(query, client)
# Search by video file
query = Query.get("MediaCollection")
|> Query.near_media(:video, media_file: "/path/to/video.mp4", distance: 0.3)
|> Query.fields(["title", "duration"])
{:ok, results} = Query.execute(query, client)
# Search by thermal imaging data
query = Query.get("SensorData")
|> Query.near_media(:thermal, media: base64_thermal, certainty: 0.8)
|> Query.fields(["timestamp", "location"])
{:ok, results} = Query.execute(query, client)
# Supported media types
NearMedia.media_types() # => [:audio, :video, :thermal, :depth, :imu]
# Using NearMedia directly
near_media = NearMedia.new(:audio, media: base64_audio, certainty: 0.7)
NearMedia.to_graphql(near_media) # => %{"media" => "...", "type" => "audio", "certainty" => 0.7}
NearMedia.to_grpc(near_media) # => %{media: "...", type: :MEDIA_TYPE_AUDIO, certainty: 0.7}
# With target vectors for named vectors
near_media = NearMedia.new(:depth,
media: base64_depth_data,
target_vectors: ["depth_vector"]
)
```
#### Convenience Methods (v0.8.0+)
For a simpler Python-like API, use the convenience methods that automatically handle
file paths, base64 data, and raw binary input:
```elixir
alias WeaviateEx.Query
# Search by image - accepts file path, base64, or binary
{:ok, results} = Query.get("Products")
|> Query.with_near_image("/path/to/image.jpg")
|> Query.limit(10)
|> Query.execute(client)
# Search by base64 image data
{:ok, results} = Query.get("Products")
|> Query.with_near_image(base64_image_data, certainty: 0.8)
|> Query.execute(client)
# Search by audio
{:ok, results} = Query.get("Podcasts")
|> Query.with_near_audio("/path/to/clip.mp3")
|> Query.execute(client)
# Search by video
{:ok, results} = Query.get("Videos")
|> Query.with_near_video("/path/to/clip.mp4")
|> Query.execute(client)
# Search by other media types
{:ok, results} = Query.get("SensorData")
|> Query.with_near_thermal(thermal_data)
|> Query.execute(client)
{:ok, results} = Query.get("DepthMaps")
|> Query.with_near_depth(depth_data, distance: 0.3)
|> Query.execute(client)
{:ok, results} = Query.get("MotionData")
|> Query.with_near_imu(imu_data)
|> Query.execute(client)
# Generic method for any media type
{:ok, results} = Query.get("Products")
|> Query.with_near_media(:image, "/path/to/image.jpg", certainty: 0.8)
|> Query.execute(client)
```
**Convenience method options:**
- `:certainty` - Minimum certainty threshold (0.0 to 1.0)
- `:distance` - Maximum distance threshold
- `:target_vectors` - Target vectors for multi-vector collections
**Supported modalities:** image, audio, video, thermal, depth, imu
**Note:** Requires a multimodal vectorizer (e.g., `multi2vec-clip` for images,
`multi2vec-bind` for audio/video).
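Accepting a file path, base64 data, or raw binary means the input has to be normalized to base64 before it is sent. The sketch below shows one possible normalization order using only the standard library. `MediaInputSketch` is hypothetical, and the detection rules (existing file first, then valid base64, otherwise raw bytes) are assumptions; the library's own heuristics may differ.

```elixir
defmodule MediaInputSketch do
  # Normalizes a file path, a base64 string, or raw binary data to base64.
  def to_base64(input) when is_binary(input) do
    cond do
      # Printable text naming an existing regular file: read and encode it.
      String.printable?(input) and File.regular?(input) ->
        input |> File.read!() |> Base.encode64()

      # Already valid base64: pass it through unchanged.
      base64?(input) ->
        input

      # Anything else is treated as raw bytes.
      true ->
        Base.encode64(input)
    end
  end

  defp base64?(input) do
    String.printable?(input) and match?({:ok, _}, Base.decode64(input))
  end
end
```

Short plain strings can happen to be valid base64, so a real implementation may need an explicit option to disambiguate.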
#### Media Type Reference
| Type | Description | Use Case |
|------|-------------|----------|
| `:audio` | Audio files (wav, mp3, etc.) | Voice search, audio similarity |
| `:video` | Video files (mp4, avi, etc.) | Video content matching |
| `:thermal` | Thermal imaging data | Industrial inspection, security |
| `:depth` | Depth sensor data | 3D object recognition |
| `:imu` | Inertial measurement unit data | Motion/gesture recognition |
### Generative Search (RAG)
Combine search with AI generation for retrieval-augmented generation:
```elixir
alias WeaviateEx.Query
alias WeaviateEx.Query.Generate
# Single-object generation - generate for each result
query = Generate.new("Article")
|> Generate.near_text("artificial intelligence")
|> Generate.single("Summarize this article in one sentence: {title}")
|> Generate.return_properties(["title", "content"])
|> Generate.limit(5)
{:ok, result} = Generate.execute(query, client)
# Access generated content per object
for obj <- result.objects do
IO.puts("Title: #{obj["title"]}")
IO.puts("Generated: #{obj["_additional"]["generate"]["singleResult"]}")
end
# Grouped generation - generate once for all results combined
query = Generate.new("Article")
|> Generate.bm25("machine learning")
|> Generate.grouped("Based on these articles, what are the main trends?",
properties: ["title", "content"])
|> Generate.return_properties(["title"])
|> Generate.limit(10)
{:ok, result} = Generate.execute(query, client)
IO.puts("Combined insight: #{result.generated}")
# Hybrid search with generation
query = Generate.new("Article")
|> Generate.hybrid("neural networks", alpha: 0.7)
|> Generate.single("Extract key points from: {content}")
|> Generate.return_properties(["title", "content"])
{:ok, result} = Generate.execute(query, client)
# Convert existing Query to generative query
query = Query.get("Article")
|> Query.near_text("climate change")
|> Query.fields(["title", "content"])
|> Query.limit(5)
gen_query = Query.generate(query, :single, "Summarize: {content}")
{:ok, result} = Generate.execute(gen_query, client)
```
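Single prompts such as `"Summarize: {content}"` use `{property}` placeholders that are filled in per object. Weaviate performs this substitution on the server; the sketch below only illustrates the semantics in plain Elixir. `PromptTemplateSketch` is hypothetical, and leaving unknown placeholders untouched is an assumption of this sketch.

```elixir
defmodule PromptTemplateSketch do
  # Replaces each {property} placeholder with the object's value for that property.
  # Placeholders without a matching property are left as they are.
  def render(template, properties) do
    Regex.replace(~r/\{(\w+)\}/, template, fn placeholder, name ->
      case Map.fetch(properties, name) do
        {:ok, value} -> to_string(value)
        :error -> placeholder
      end
    end)
  end
end
```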
### Query References (v0.7.0+)
Query cross-references with multi-target support and metadata:
```elixir
alias WeaviateEx.Query
alias WeaviateEx.Query.QueryReference
# Basic reference query
ref = QueryReference.new("hasAuthor", return_properties: ["name", "email"])
# Multi-target reference query (for references pointing to multiple collections)
ref = QueryReference.multi_target("relatedTo", "Article",
return_properties: ["title", "publishedAt"]
)
# Check if reference is multi-target
QueryReference.multi_target?(ref) # => true
# Request metadata in referenced objects
ref = QueryReference.new("hasAuthor",
return_properties: ["name"],
return_metadata: [:uuid, :distance, :certainty]
)
# Use metadata presets
ref = QueryReference.new("hasAuthor",
return_properties: ["name"],
return_metadata: :full # All available metadata
)
ref = QueryReference.new("hasAuthor",
return_properties: ["name"],
return_metadata: :common # uuid, distance, certainty, creation_time
)
# Use in queries
query = Query.get("Article")
|> Query.fields(["title", "content"])
|> Query.reference(ref)
```
### Aggregations
Statistical analysis over your data:
```elixir
alias WeaviateEx.API.Aggregate
alias WeaviateEx.Aggregate.Metrics
# Count all objects
{:ok, result} = Aggregate.over_all(client, "Product", metrics: [:count])
# Numeric aggregations (mean, sum, min, max)
{:ok, stats} = Aggregate.over_all(client, "Product",
properties: [{:price, [:mean, :sum, :maximum, :minimum, :count]}]
)
# Top occurrences for text fields
{:ok, categories} = Aggregate.over_all(client, "Product",
properties: [{:category, [:topOccurrences], limit: 10}]
)
# Group by with aggregations
{:ok, grouped} = Aggregate.group_by(client, "Product", "category",
metrics: [:count],
properties: [{:price, [:mean, :maximum, :minimum]}]
)
```
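For intuition about what the numeric metrics and `:topOccurrences` compute, the sketch below calculates the same statistics over an in-memory list. `AggregateSketch` is hypothetical and the return shapes are chosen for illustration; the server computes the real aggregations and the response format may differ.

```elixir
defmodule AggregateSketch do
  # Numeric metrics over a non-empty list of numbers.
  def numeric(values) do
    count = length(values)
    sum = Enum.sum(values)

    %{
      count: count,
      sum: sum,
      mean: sum / count,
      minimum: Enum.min(values),
      maximum: Enum.max(values)
    }
  end

  # The `limit` most frequent values as {value, occurrences}, most frequent first.
  def top_occurrences(values, limit) do
    values
    |> Enum.frequencies()
    |> Enum.sort_by(fn {value, n} -> {-n, value} end)
    |> Enum.take(limit)
  end
end
```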
#### Near Object Aggregation
Aggregate objects similar to a reference object:
```elixir
# Aggregate objects near a reference UUID
{:ok, result} = Aggregate.with_near_object(client, "Articles", reference_uuid,
distance: 0.5,
metrics: [:count],
properties: [
{:views, [:mean, :sum]},
{:category, [:topOccurrences], limit: 5}
]
)
IO.inspect(result)
# => %{"meta" => %{"count" => 42}, "views" => %{"mean" => 1250.5, "sum" => 52521}}
```
#### Hybrid Aggregation
Aggregate with combined keyword and vector search:
```elixir
# Hybrid search aggregation (balanced keyword + vector)
{:ok, result} = Aggregate.with_hybrid(client, "Products", "electronics",
alpha: 0.5, # 50% vector, 50% keyword (default)
metrics: [:count],
properties: [
{:price, [:sum, :mean, :minimum, :maximum]}
]
)
# Pure keyword search aggregation (alpha = 0)
{:ok, result} = Aggregate.with_hybrid(client, "Products", "laptop",
alpha: 0.0,
fusion_type: :ranked,
metrics: [:count]
)
# Vector-weighted search aggregation (alpha = 0.8)
{:ok, result} = Aggregate.with_hybrid(client, "Products", "portable computer",
alpha: 0.8,
fusion_type: :relative_score,
properties: [{:category, [:topOccurrences], limit: 3}]
)
```
#### Using the Metrics Helper
Build metrics specifications with the helper module:
```elixir
alias WeaviateEx.API.Aggregate
alias WeaviateEx.Aggregate.Metrics
# Number metrics with all options
{:ok, result} = Aggregate.over_all(client, "Products",
metrics: [Metrics.count()],
properties: [
Metrics.number("price", sum: true, mean: true, minimum: true, maximum: true),
Metrics.text("category", top_occurrences: 5),
Metrics.boolean("inStock")
]
)
```
### Advanced Filtering
Build complex filters with a type-safe DSL:
```elixir
alias WeaviateEx.Filter
alias WeaviateEx.Query
# Simple equality
filter = Filter.equal("status", "published")
# Numeric comparisons
filter = Filter.greater_than("views", 100)
filter = Filter.less_than_equal("price", 50.0)
# Text pattern matching
filter = Filter.like("title", "*AI*")
# Array operations
filter = Filter.contains_any("tags", ["elixir", "phoenix"])
filter = Filter.contains_all("tags", ["elixir", "tutorial"])
# Geospatial queries
filter = Filter.within_geo_range("location", {40.7128, -74.0060}, 5000.0)
# Date comparisons
filter = Filter.greater_than("publishedAt", "2025-01-01T00:00:00Z")
# Null checks
filter = Filter.is_null("deletedAt")
# Property length filtering (v0.7.0+)
filter = Filter.by_property_length("title", :greater_than, 10)
filter = Filter.by_property_length("tags", :greater_or_equal, 3)
# Combine filters with AND
combined = Filter.all_of([
Filter.equal("status", "published"),
Filter.greater_than("views", 100),
Filter.like("title", "*Elixir*")
])
# Combine filters with OR
or_filter = Filter.any_of([
Filter.equal("category", "technology"),
Filter.equal("category", "science")
])
# Negate filters
not_filter = Filter.none_of([
Filter.equal("status", "draft")
])
# Use in queries
query = Query.get("Article")
|> Query.where(Filter.to_graphql(combined))
|> Query.fields(["title", "views"])
```
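Combined filters nest into a tree of operators and operands. The sketch below builds that tree as plain maps, following the `path`/`operator`/`value*` shape used in the `Query.where/2` example earlier in this README. `WhereFilterSketch` is hypothetical, and the exact output of `Filter.to_graphql/1` is an assumption that may differ in key names or structure.

```elixir
defmodule WhereFilterSketch do
  # Leaf conditions: one property path, one operator, one typed value.
  def equal(property, value) when is_binary(value),
    do: %{path: [property], operator: "Equal", valueText: value}

  def greater_than(property, value) when is_integer(value),
    do: %{path: [property], operator: "GreaterThan", valueInt: value}

  # Combinators: wrap child filters as operands of a logical operator.
  def all_of(filters), do: %{operator: "And", operands: filters}
  def any_of(filters), do: %{operator: "Or", operands: filters}
end
```

Nesting `any_of/1` inside `all_of/1` yields an `"And"` node whose operands include an `"Or"` node, which is how grouped conditions are expressed.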
#### Deep Reference Filtering (v0.7.0+)
Filter through chains of references to reach nested properties:
```elixir
alias WeaviateEx.Filter
alias WeaviateEx.Filter.RefPath
# Filter articles where the author's company is in technology
filter = RefPath.through("hasAuthor", "Author")
|> RefPath.through("worksAt", "Company")
|> RefPath.property("industry", :equal, "Technology")
# Filter by author name directly
filter = RefPath.through("hasAuthor", "Author")
|> RefPath.property("name", :like, "John*")
# Combine with other filters
combined = Filter.all_of([
RefPath.through("hasAuthor", "Author")
|> RefPath.property("verified", :equal, true),
Filter.equal("status", "published")
])
# Get path depth
path = RefPath.through("hasAuthor", "Author")
|> RefPath.through("worksAt", "Company")
RefPath.depth(path) # => 2
# Use convenience function
filter = Filter.by_ref_path(
RefPath.through("hasAuthor", "Author"),
"name",
:equal,
"Jane"
)
```
#### Multi-Target Reference Filtering (v0.7.0+)
Filter on multi-target reference properties that can point to different collections:
```elixir
alias WeaviateEx.Filter
alias WeaviateEx.Filter.{MultiTargetRef, RefPath}
# Filter where "relatedTo" points to an Article with specific title
filter = MultiTargetRef.new("relatedTo", "Article")
|> MultiTargetRef.where("title", :equal, "My Article")
# Filter where "mentions" points to a verified Person
filter = MultiTargetRef.new("mentions", "Person")
|> MultiTargetRef.where("verified", :equal, true)
# Deep path filtering through multi-target reference
filter = MultiTargetRef.new("mentions", "Person")
|> MultiTargetRef.deep_where(fn path ->
path
|> RefPath.through("worksAt", "Company")
|> RefPath.property("industry", :equal, "Tech")
end)
# Convert to RefPath for chaining
ref_path = MultiTargetRef.new("mentions", "Person")
|> MultiTargetRef.as_ref_path()
|> RefPath.through("worksAt", "Company")
|> RefPath.property("name", :equal, "Acme")
# Combine with other filters
combined = Filter.all_of([
MultiTargetRef.new("relatedTo", "Article")
|> MultiTargetRef.where("status", :equal, "published"),
Filter.equal("featured", true)
])
# Use convenience function
filter = Filter.by_ref_multi_target(
"relatedTo",
"Article",
"status",
:equal,
"published"
)
```
### Vector Configuration
Configure vectorizers and index types:
```elixir
alias WeaviateEx.API.VectorConfig
# Custom vectors with HNSW index
config = VectorConfig.new("AIArticle")
|> VectorConfig.with_vectorizer(:none) # Bring your own vectors
|> VectorConfig.with_hnsw_index(
distance: :cosine,
ef: 100,
max_connections: 64
)
|> VectorConfig.with_properties([
%{"name" => "title", "dataType" => ["text"]},
%{"name" => "content", "dataType" => ["text"]}
])
{:ok, _} = Collections.create(client, config)
# HNSW with Product Quantization (compression)
config = VectorConfig.new("CompressedData")
|> VectorConfig.with_vectorizer(:none)
|> VectorConfig.with_hnsw_index(distance: :dot)
|> VectorConfig.with_product_quantization(
enabled: true,
segments: 96,
centroids: 256
)
# Flat index for exact search (no approximation)
config = VectorConfig.new("ExactSearch")
|> VectorConfig.with_vectorizer(:none)
|> VectorConfig.with_flat_index(distance: :dot)
```
### Inverted Index Configuration (v0.5.0+)
Configure BM25 and stopwords for full-text search:
```elixir
alias WeaviateEx.API.InvertedIndexConfig
# Configure BM25 algorithm parameters
bm25_config = InvertedIndexConfig.bm25(b: 0.75, k1: 1.2)
# Configure stopwords with English preset and customizations
stopwords = InvertedIndexConfig.stopwords(
preset: :en,
additions: ["foo", "bar"],
removals: ["the"]
)
# Build complete inverted index configuration
config = InvertedIndexConfig.build(
bm25: [b: 0.8, k1: 1.5],
stopwords: [preset: :en],
index_timestamps: true,
index_property_length: true,
index_null_state: false,
cleanup_interval_seconds: 60
)
# Validate configuration
{:ok, validated} = InvertedIndexConfig.validate(config)
# Merge configurations
merged = InvertedIndexConfig.merge(base_config, override_config)
```
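The `b` and `k1` parameters shape the BM25 score: `k1` controls how quickly repeated terms stop adding value, and `b` controls how strongly long documents are penalized. The sketch below evaluates the standard BM25 term formula so the effect of each parameter can be tried out. `Bm25Sketch` is hypothetical, and the inverse document frequency is passed in as a plain number to keep the example short.

```elixir
defmodule Bm25Sketch do
  # Score contribution of one query term for one document.
  #   tf          - occurrences of the term in the document
  #   doc_len     - document length in tokens
  #   avg_doc_len - average document length in the collection
  #   idf         - inverse document frequency of the term
  def term_score(tf, doc_len, avg_doc_len, idf, opts \\ []) do
    k1 = Keyword.get(opts, :k1, 1.2)
    b = Keyword.get(opts, :b, 0.75)
    length_norm = 1 - b + b * doc_len / avg_doc_len

    idf * tf * (k1 + 1) / (tf + k1 * length_norm)
  end
end
```

Setting `b: 0.0` switches length normalization off, so documents of any length score the same for equal term counts.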
### Reranker Configuration (v0.7.0+)
Configure reranking models to improve search result relevance:
```elixir
alias WeaviateEx.API.RerankerConfig
# Cohere reranker (default or specific model)
config = RerankerConfig.cohere()
config = RerankerConfig.cohere("rerank-english-v3.0")
config = RerankerConfig.cohere("rerank-multilingual-v3.0", base_url: "https://api.cohere.ai")
# Local transformers reranker
config = RerankerConfig.transformers()
config = RerankerConfig.transformers(inference_url: "http://localhost:8080")
# Voyage AI reranker
config = RerankerConfig.voyageai("rerank-1")
config = RerankerConfig.voyageai("rerank-lite-1", base_url: "https://api.voyageai.com")
# Jina AI reranker
config = RerankerConfig.jinaai("jina-reranker-v1-base-en")
config = RerankerConfig.jinaai("jina-reranker-v1-turbo-en")
# Custom/unlisted reranker provider
config = RerankerConfig.custom("my-reranker",
api_endpoint: "https://reranker.example.com",
model: "rerank-v1",
max_tokens: 512
)
# Disable reranking
config = RerankerConfig.none()
# Use in collection creation
{:ok, _} = Collections.create("Article", %{
properties: [...],
reranker_config: config
})
```
### Custom Generative Provider Configuration (v0.7.0+)
Configure unlisted generative AI providers with custom settings:
```elixir
alias WeaviateEx.API.GenerativeConfig
# Custom generative provider for unlisted LLMs
config = GenerativeConfig.custom("my-llm",
api_endpoint: "https://llm.example.com",
model: "custom-gpt",
temperature: 0.7,
max_tokens: 2048
)
# Custom provider with authentication options
config = GenerativeConfig.custom("enterprise-llm",
api_endpoint: "https://llm.internal.corp",
model: "llm-v2",
api_key_header: "X-API-Key",
temperature: 0.5
)
# Use with collection
{:ok, _} = Collections.create("Article", %{
properties: [...],
generative_config: config
})
```
### Backup & Restore
Complete backup and restore operations with multiple storage backends:
```elixir
alias WeaviateEx.Backup.{Config, Location}
# Create a backup to filesystem
{:ok, status} = WeaviateEx.create_backup(client, "daily-backup", :filesystem)
# Create backup to S3 with specific collections and wait for completion
{:ok, status} = WeaviateEx.create_backup(client, "daily-backup", :s3,
include_collections: ["Article", "Author"],
wait_for_completion: true,
config: Config.create(compression: :best_compression)
)
# Check backup status
{:ok, status} = WeaviateEx.get_backup_status(client, "daily-backup", :filesystem)
IO.puts("Status: #{status.status}") # :started, :transferring, :success, etc.
# List all backups
{:ok, backups} = WeaviateEx.list_backups(client, :filesystem)
# Restore a backup
{:ok, status} = WeaviateEx.restore_backup(client, "daily-backup", :filesystem,
wait_for_completion: true
)
# Restore specific collections only
{:ok, status} = WeaviateEx.restore_backup(client, "daily-backup", :s3,
include_collections: ["Article"]
)
# Cancel an in-progress backup
:ok = WeaviateEx.cancel_backup(client, "daily-backup", :filesystem)
```
#### Storage Backends
| Backend | Description | Configuration |
|---------|-------------|---------------|
| `:filesystem` | Local filesystem | `BACKUP_FILESYSTEM_PATH` on server |
| `:s3` | Amazon S3 / S3-compatible | Bucket, region, credentials |
| `:gcs` | Google Cloud Storage | Bucket, project ID, credentials |
| `:azure` | Azure Blob Storage | Container, connection string |
#### Compression Options (v0.5.0+)
```elixir
alias WeaviateEx.Backup.{Config, Compression}
# GZIP compression (default)
Config.create(compression: :default) # Balanced GZIP
Config.create(compression: :best_speed) # Fast GZIP
Config.create(compression: :best_compression) # Max GZIP
# ZSTD compression (faster, better ratios)
Config.create(compression: :zstd_default) # Balanced ZSTD
Config.create(compression: :zstd_best_speed) # Fast ZSTD
Config.create(compression: :zstd_best_compression) # Max ZSTD
# No compression
Config.create(compression: :no_compression)
# Check compression type
Compression.gzip?(:default) # => true
Compression.zstd?(:zstd_default) # => true
```
#### RBAC Restore Options (v0.6.0+)
Restore backups with fine-grained control over RBAC data:
```elixir
alias WeaviateEx.Backup
# Restore with RBAC options
{:ok, status} = Backup.restore(client, "daily-backup", :s3,
roles_restore: true, # Restore role definitions
users_restore: true, # Restore user assignments
overwrite_alias: true, # Overwrite existing aliases
wait_for_completion: true
)
# Selective RBAC restore - roles only
{:ok, status} = Backup.restore(client, "daily-backup", :filesystem,
roles_restore: true,
users_restore: false
)
```
#### Location Configuration (Advanced)
Use typed location structs for cloud backend configuration:
```elixir
alias WeaviateEx.Backup.{Location, Config}
# Filesystem location
fs_loc = Location.filesystem("/var/backups/weaviate")
# S3 location with full configuration
s3_loc = Location.s3("my-bucket", "/backups",
endpoint: "s3.us-west-2.amazonaws.com",
region: "us-west-2",
access_key_id: "...",
secret_access_key: "...",
use_ssl: true
)
# GCS location
gcs_loc = Location.gcs("my-bucket", "/backups",
project_id: "my-project",
credentials: %{...}
)
# Azure location
azure_loc = Location.azure("my-container", "/backups",
connection_string: "..."
)
# Use location structs directly in backup operations
{:ok, status} = Backup.create(client, "backup-001", s3_loc,
include_collections: ["Article"],
config: Config.create(chunk_size: 128, compression: :zstd_default)
)
# Restore from location struct
{:ok, status} = Backup.restore(client, "backup-001", s3_loc,
roles_restore: true
)
```
### Collection Aliases (v0.5.0+)
Aliases allow zero-downtime collection updates by providing alternative names:
```elixir
alias WeaviateEx.API.Aliases
# Create an alias (requires Weaviate v1.32.0+)
{:ok, _} = Aliases.create(client, "articles", "Article_v1")
# List all aliases
{:ok, aliases} = Aliases.list(client)
# => [%Alias{alias: "articles", collection: "Article_v1"}]
# Update alias to point to new collection (blue-green deployment)
{:ok, _} = Aliases.update(client, "articles", "Article_v2")
# Get alias details
{:ok, alias_info} = Aliases.get(client, "articles")
# => %Alias{alias: "articles", collection: "Article_v2"}
# Check if alias exists
{:ok, true} = Aliases.exists?(client, "articles")
# Delete alias (underlying collection remains)
{:ok, true} = Aliases.delete(client, "articles")
```
### Multi-Tenancy
Isolate data per tenant with automatic partitioning:
```elixir
alias WeaviateEx.API.{VectorConfig, Tenants}
# Create multi-tenant collection
config = VectorConfig.new("TenantArticle")
|> VectorConfig.with_multi_tenancy(enabled: true)
|> VectorConfig.with_properties([
%{"name" => "title", "dataType" => ["text"]}
])
Collections.create(client, config)
# Create tenants
{:ok, created} = Tenants.create(client, "TenantArticle",
["CompanyA", "CompanyB", "CompanyC"]
)
# List all tenants
{:ok, tenants} = Tenants.list(client, "TenantArticle")
# Get specific tenant
{:ok, tenant} = Tenants.get(client, "TenantArticle", "CompanyA")
# Check existence
{:ok, true} = Tenants.exists?(client, "TenantArticle", "CompanyA")
# Deactivate tenant (set to COLD storage)
{:ok, _} = Tenants.deactivate(client, "TenantArticle", "CompanyB")
# List only active tenants
{:ok, active} = Tenants.list_active(client, "TenantArticle")
# Activate tenant (set to HOT)
{:ok, _} = Tenants.activate(client, "TenantArticle", "CompanyB")
# Count tenants
{:ok, count} = Tenants.count(client, "TenantArticle")
# Delete tenant
{:ok, _} = Tenants.delete(client, "TenantArticle", "CompanyC")
# Scope data operations to a tenant with the :tenant option
{:ok, objects} = Data.insert(client, "TenantArticle", data, tenant: "CompanyA")
```
#### Fluent with_tenant API (v0.7.4+)
Get a tenant-scoped collection reference for cleaner multi-tenant code:
```elixir
alias WeaviateEx.{Collections, TenantCollection, Query}
# Get tenant-scoped collection (matches Python client pattern)
tenant_col = Collections.with_tenant(client, "Articles", "tenant_A")
# All operations automatically scoped to tenant_A
{:ok, _} = TenantCollection.insert(tenant_col, %{
title: "My Article",
content: "Article content"
})
# Query within tenant
{:ok, results} = tenant_col
|> TenantCollection.query()
|> Query.bm25("search term")
|> Query.execute(client)
# Batch insert within tenant
{:ok, _} = TenantCollection.insert_many(tenant_col, [
%{title: "Article 1"},
%{title: "Article 2"}
])
# Get, update, delete operations
{:ok, obj} = TenantCollection.get(tenant_col, uuid)
{:ok, _} = TenantCollection.update(tenant_col, uuid, %{title: "Updated"})
{:ok, _} = TenantCollection.delete(tenant_col, uuid)
```
#### Traditional API (still supported)
```elixir
# Pass tenant as option to each operation
{:ok, _} = Objects.create("Articles", object, tenant: "tenant_A")
{:ok, _} = Query.get("Articles") |> Query.tenant("tenant_A") |> Query.execute(client)
```
### RBAC (Role-Based Access Control)
WeaviateEx provides full RBAC support for managing roles, permissions, users, and groups.
#### Creating Roles with Permissions
```elixir
alias WeaviateEx.API.RBAC
alias WeaviateEx.RBAC.Permissions
# Define permissions using the builder API
permissions = [
Permissions.collections("Article", [:read, :create]),
Permissions.data("Article", [:read, :create, :update]),
Permissions.tenants("Article", [:read])
]
# Create a role
{:ok, role} = RBAC.create_role(client, "article-editor", permissions)
# List all roles
{:ok, roles} = RBAC.list_roles(client)
# Check if role has specific permissions
{:ok, true} = RBAC.has_permissions?(client, "article-editor",
[Permissions.data("Article", :read)]
)
# Add more permissions to a role
:ok = RBAC.add_permissions(client, "article-editor",
[Permissions.nodes(:verbose)]
)
# Delete a role
:ok = RBAC.delete_role(client, "article-editor")
```
#### Role Scope Permissions (v0.6.0+)
Fine-grained permissions with collection/tenant/shard scopes:
```elixir
alias WeaviateEx.API.RBAC.{Scope, Permission}
# Create scopes for fine-grained access
scope = Scope.collection("Article")
|> Scope.with_tenants(["tenant-a", "tenant-b"])
# Or use wildcard access
all_scope = Scope.all_collections()
# Build permissions with scopes
permissions = [
Permission.read_collection("Article"),
Permission.manage_data("Article"),
Permission.new(:data, :read, scope: Scope.collection("*")),
Permission.new(:tenants, :create, scope: scope)
]
# Convenience functions for common patterns
admin_permissions = Permission.admin() # Full access
viewer_permissions = Permission.viewer() # Read-only access
```
#### Permission Types
| Type | Actions | Description |
|------|---------|-------------|
| collections | create, read, update, delete, manage | Collection schema operations |
| data | create, read, update, delete, manage | Object CRUD operations |
| tenants | create, read, update, delete | Multi-tenancy management |
| roles | create, read, update, delete | Role management |
| users | create, read, update, delete, assign_and_revoke | User management |
| groups | read, assign_and_revoke | OIDC group management |
| cluster | read | Cluster information |
| nodes | read (minimal/verbose) | Node information |
| backups | manage | Backup operations |
| replicate | create, read, update, delete | Replication management |
| alias | create, read, update, delete | Collection aliases |
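The table above can be read as a lookup from permission type to allowed actions. The following is a minimal sketch, assuming a hypothetical `PermissionTable` module (not part of WeaviateEx) that mirrors the rows as plain data so a type/action pair can be checked before a role is built:

```elixir
defmodule PermissionTable do
  @moduledoc "Hypothetical helper mirroring the permission table; not part of WeaviateEx."

  @crud [:create, :read, :update, :delete]

  # One entry per row of the table above.
  @actions %{
    collections: @crud ++ [:manage],
    data: @crud ++ [:manage],
    tenants: @crud,
    roles: @crud,
    users: @crud ++ [:assign_and_revoke],
    groups: [:read, :assign_and_revoke],
    cluster: [:read],
    nodes: [:read],
    backups: [:manage],
    replicate: @crud,
    alias: @crud
  }

  # Returns the actions listed for a permission type, or [] for unknown types.
  def actions(type), do: Map.get(@actions, type, [])

  # True only when the action appears in the row for the given type.
  def valid?(type, action), do: action in actions(type)
end

IO.inspect(PermissionTable.valid?(:groups, :read))
IO.inspect(PermissionTable.valid?(:groups, :create))
```

The server remains the source of truth; a local check like this only catches typos early.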
### User Management
```elixir
alias WeaviateEx.API.Users
# Create a new DB user (returns API key)
{:ok, user} = Users.create(client, "john.doe")
IO.puts("API Key: #{user.api_key}")
# Get user info
{:ok, user} = Users.get(client, "john.doe")
# Get current authenticated user
{:ok, me} = Users.get_my_user(client)
# Assign roles to user
:ok = Users.assign_roles(client, "john.doe", ["article-editor", "viewer"])
# Revoke roles from user
:ok = Users.revoke_roles(client, "john.doe", ["viewer"])
# Get user's assigned roles
{:ok, roles} = Users.get_assigned_roles(client, "john.doe")
# Rotate API key
{:ok, new_key} = Users.rotate_key(client, "john.doe")
# Deactivate/activate user
:ok = Users.deactivate(client, "john.doe")
:ok = Users.activate(client, "john.doe")
# Delete user
:ok = Users.delete(client, "john.doe")
```
#### Separate DB and OIDC User Management (v0.6.0+)
For fine-grained control, use the specialized modules:
```elixir
alias WeaviateEx.API.Users.{DB, OIDC}
# Database-backed users (full lifecycle management)
{:ok, user} = DB.create(client, "db-user")
{:ok, new_key} = DB.rotate_api_key(client, "db-user")
{:ok, _} = DB.delete(client, "db-user")
# OIDC users (managed externally, role assignment only)
{:ok, users} = OIDC.list(client)
{:ok, user} = OIDC.get(client, "oidc-user@example.com")
:ok = OIDC.assign_roles(client, "oidc-user@example.com", ["viewer"])
:ok = OIDC.revoke_roles(client, "oidc-user@example.com", ["admin"])
```
### Group Management
OIDC group management for role assignments:
```elixir
alias WeaviateEx.API.Groups
# List known OIDC groups
{:ok, groups} = Groups.list_known(client)
# Assign roles to a group
:ok = Groups.assign_roles(client, "engineering", ["developer", "viewer"])
# Get roles assigned to a group
{:ok, roles} = Groups.get_assigned_roles(client, "engineering")
# Revoke roles from a group
:ok = Groups.revoke_roles(client, "engineering", ["admin"])
```
## Examples
WeaviateEx includes **8 runnable examples** that demonstrate the core features:
| Example | Description | What You'll Learn |
|---------|-------------|-------------------|
| `01_collections.exs` | Collection management | Create, list, get, add properties, delete collections |
| `02_data.exs` | CRUD operations | Insert, get, patch, check existence, delete objects |
| `03_filter.exs` | Advanced filtering | Equality, comparison, pattern matching, geo, array filters |
| `04_aggregate.exs` | Aggregations | Count, statistics, top occurrences, group by |
| `05_vector_config.exs` | Vector configuration | HNSW, PQ compression, flat index, distance metrics |
| `06_tenants.exs` | Multi-tenancy | Create tenants, activate/deactivate, list, delete |
| `07_batch.exs` | Batch API | Bulk create/delete with summaries, query remaining data |
| `08_query.exs` | Query builder | BM25 search, filters, near-vector similarity |
### Prerequisites
Follow these steps once before running any example:
1. **Start the local stack** (full profile with all compose files):
```bash
# from the project root
mix weaviate.start --version latest
# or use the helper script
./scripts/weaviate-stack.sh start --version latest
```
To shut everything down afterwards, use `mix weaviate.stop --version latest` (or `./scripts/weaviate-stack.sh stop`).
2. **Confirm the services are healthy** (optional but recommended):
```bash
mix weaviate.status
```
3. **Point the client at the running cluster** (avoids repeated configuration warnings):
```bash
export WEAVIATE_URL=http://localhost:8080
# set WEAVIATE_API_KEY=... as well if your instance requires auth
```
### Running Examples
All examples are self-contained and include clean visual output:
```bash
# With WEAVIATE_URL exported, run any example
mix run examples/01_collections.exs
mix run examples/02_data.exs
mix run examples/03_filter.exs
# ... etc
# Or run all examples
for example in examples/*.exs; do
echo "Running $example..."
mix run "$example"
done
```
Each example:
- ✅ Checks Weaviate connectivity before running
- ✅ Shows the code being executed
- ✅ Displays formatted results
- ✅ Cleans up after itself (deletes test data)
- ✅ Provides clear success/error messages
## Supported Weaviate Versions
| Weaviate Version | Status | Notes |
|------------------|--------|-------|
| 1.35.x | Fully Supported | Latest |
| 1.34.x | Fully Supported | gRPC streaming |
| 1.33.x | Fully Supported | |
| 1.32.x | Fully Supported | |
| 1.31.x | Fully Supported | |
| 1.30.x | Fully Supported | |
| 1.29.x | Fully Supported | |
| 1.28.x | Fully Supported | |
| 1.27.x | Fully Supported | Minimum |
| < 1.27 | Not Tested | Below the client minimum; rejected by the version check on connect |
Testing is performed against all supported versions in CI.
## Testing
WeaviateEx has **comprehensive test coverage** with two testing modes:
### Test Modes
**Mock Mode (Default)** - Fast, isolated unit tests:
- ✅ Uses Mox to mock HTTP/Protocol and gRPC responses
- ✅ No Weaviate instance required
- ✅ Fast execution (~0.2 seconds)
- ✅ 2248+ unit tests
- ✅ Perfect for TDD and CI/CD
**Integration Mode** - Real Weaviate testing:
- ✅ Tests against live Weaviate instance
- ✅ Validates actual API behavior
- ✅ Requires Weaviate running locally
- ✅ Run with `--include integration` flag
- ✅ 10 integration test suites (collections, objects, batch, query, health, search, filter, aggregate, auth/RBAC, backup)
### Running Tests
```bash
# Run all unit tests with mocks (default - no Weaviate needed)
mix test
# EASIEST: Run integration tests with automatic Weaviate management
mix weaviate.test # Starts Weaviate, runs tests, stops Weaviate
mix weaviate.test --keep # Keep Weaviate running after tests
mix weaviate.test -v 1.30.5 # Test against specific Weaviate version
# MANUAL: Run integration tests with separate Weaviate management
mix weaviate.start # Start Weaviate containers
mix test --include integration # Run integration tests
mix weaviate.stop # Stop Weaviate containers
# Or use environment variable
WEAVIATE_INTEGRATION=true mix test --include integration
# Run specific test file
mix test test/weaviate_ex/api/collections_test.exs
# Run specific test by line number
mix test test/weaviate_ex/objects_test.exs:95
# Run with coverage report (basic)
mix test --cover
# Run with coverage report (detailed HTML via excoveralls)
mix coveralls.html
open cover/excoveralls.html
# Run only integration tests
mix test --only integration
# Run specific integration test suites
mix test --only integration test/integration/search_integration_test.exs
mix test --only rbac # RBAC tests (requires port 8092)
mix test --only backup # Backup tests (requires port 8093)
```
### Test Structure
```
test/
├── test_helper.exs # Test setup, Mox configuration
├── support/
│ ├── factory.ex # Test data factories
│ ├── mocks.ex # Mox mock definitions
│ └── integration_case.ex # Shared integration test module
├── weaviate_ex_test.exs # Top-level API tests
├── weaviate_ex/
│ ├── api/ # API module tests (mocked)
│ │ ├── collections_test.exs
│ │ ├── data_test.exs
│ │ ├── aggregate_test.exs
│ │ ├── tenants_test.exs
│ │ └── ...
│ ├── filter_test.exs # Filter system tests
│ ├── objects_test.exs # Objects API tests
│ ├── batch_test.exs # Batch operations tests
│ └── query_test.exs # Query builder tests
├── integration/ # Integration tests (live Weaviate)
│ ├── collections_integration_test.exs # Collection CRUD
│ ├── objects_integration_test.exs # Object CRUD
│ ├── batch_integration_test.exs # Batch operations
│ ├── query_integration_test.exs # Query execution
│ ├── health_integration_test.exs # Health checks
│ ├── search_integration_test.exs # BM25, near_vector, pagination
│ ├── filter_integration_test.exs # Filter operators, AND/OR
│ ├── aggregate_integration_test.exs # Aggregations, group by
│ ├── auth_integration_test.exs # RBAC, API key auth (port 8092)
│ └── backup_integration_test.exs # Backup/restore (port 8093)
└── journey/ # Web framework journey tests
├── scenarios.ex # Shared journey test scenarios
├── scenarios_test.exs # Direct scenario tests
├── phoenix_test.exs # Phoenix endpoint integration
└── plug_test.exs # Plug router integration
```
### Integration Test Helper
Use `WeaviateEx.IntegrationCase` for consistent test setup:
```elixir
defmodule MyIntegrationTest do
use WeaviateEx.IntegrationCase # Auto-configures HTTP client, cleanup
test "my integration test" do
# Unique collection names with automatic cleanup
{name, {:ok, _}} = create_test_collection("MyTest", properties: [...])
# Or use scoped collections
with_collection([prefix: "Scoped"], fn name ->
# Collection exists only within this block
end)
end
end
```
### Journey Tests
Journey tests validate WeaviateEx integration with Phoenix and Plug web frameworks. These tests ensure the SDK works correctly when:
- Initialized at application startup and closed at shutdown
- Used from both synchronous and asynchronous contexts (different processes)
- Handling multiple concurrent web requests
- Managing connection lifecycle within web framework patterns
```bash
# Start Weaviate
mix weaviate.start
# Run journey tests
WEAVIATE_INTEGRATION=true mix test --include journey
# Or run all integration tests including journey
WEAVIATE_INTEGRATION=true mix test --include integration --include journey
# Stop Weaviate
mix weaviate.stop
```
See `test/journey/` for Phoenix and Plug integration examples:
- `test/journey/scenarios.ex` - Shared journey test scenarios
- `test/journey/scenarios_test.exs` - Direct scenario tests
- `test/journey/phoenix_test.exs` - Phoenix endpoint integration
- `test/journey/plug_test.exs` - Plug router integration
### Test Coverage
Current test coverage by module:
- ✅ **Collections API**: 17 tests - Create, list, get, exists, delete, add property
- ✅ **Filter System**: 80+ tests - All operators, combinators, RefPath, MultiTargetRef, property length
- ✅ **Data Operations**: 17 tests - Insert, get, patch, exists, delete with vectors
- ✅ **Objects API**: 15+ tests - Full CRUD with pagination
- ✅ **Batch Operations**: 35+ tests - Bulk create, delete, error tracking, retry logic
- ✅ **Query System**: 60+ tests - GraphQL, near_text, hybrid, BM25, move, rerank, groupBy
- ✅ **Aggregations**: 15+ tests - Count, statistics, group by
- ✅ **Tenants**: 20+ tests - Multi-tenancy with freeze/offload states
- ✅ **References**: 30+ tests - Cross-reference CRUD, multi-target references, QueryReference metadata
- ✅ **Generative AI**: 62 tests - All providers, typed configs, result parsing
- ✅ **Vector Config**: 15+ tests - HNSW, PQ, flat index, multi-vector
- ✅ **Multi-Vector**: 10+ tests - ColBERT, Muvera encoding, Jina vectorizers
- ✅ **gRPC Services**: 50+ tests - Channel management, search, batch, aggregate, tenants, health
- ✅ **gRPC Error Handling**: 30+ tests - Status code mapping, retryable errors
- ✅ **Generative Search**: 25+ tests - Query.Generate, all search types, GraphQL generation
- ✅ **Nested Properties**: 25+ tests - Property.Nested struct, serialization, validation
- ✅ **Concurrent Batch**: 20+ tests - Parallel insertion, result aggregation
- ✅ **Batch Queue**: 25+ tests - Queue operations, failure tracking, re-queue
- ✅ **Rate Limit Detection**: 20+ tests - Provider patterns, backoff calculation
- ✅ **Custom Providers**: 20+ tests - Custom generative configs, reranker configurations
**Total: 2362 tests passing**
## Mix Tasks
WeaviateEx provides Mix tasks for managing local Weaviate Docker containers:
| Task | Description |
|------|-------------|
| `mix weaviate.start` | Start Weaviate Docker containers |
| `mix weaviate.stop` | Stop Weaviate Docker containers |
| `mix weaviate.status` | Show container status and health check |
| `mix weaviate.test` | Start Weaviate, run integration tests, stop Weaviate |
| `mix weaviate.logs` | Show Docker container logs |
```bash
# Start Weaviate containers (default version: 1.28.14)
mix weaviate.start
mix weaviate.start --version 1.30.5 # Specific version
mix weaviate.start -v latest # Latest version
# Check container status and health
mix weaviate.status
# Stop all Weaviate containers
mix weaviate.stop
mix weaviate.stop --keep-data # Preserve data directory
# Run integration tests (full lifecycle management)
mix weaviate.test # Start, test, stop
mix weaviate.test --keep # Keep Weaviate running after tests
mix weaviate.test -v 1.30.5 # Test against specific version
# View container logs
mix weaviate.logs # Show last 100 lines
mix weaviate.logs --tail 50 # Show last 50 lines
mix weaviate.logs --file docker-compose-backup.yml # Specific compose file
mix weaviate.logs -f --file docker-compose.yml # Follow logs
```
The tasks shell out to scripts in `ci/` which manage multiple Docker Compose profiles (single node, RBAC, backup, cluster, async, etc.).
## Development Tools
### Benchmarks
Run performance benchmarks with Benchee:
```bash
# Start Weaviate first
mix weaviate.start
# Run all benchmarks
mix weaviate.bench
# Run specific benchmark
mix weaviate.bench batch # Batch insert performance
mix weaviate.bench query # Query performance (near_vector, BM25, hybrid)
```
Results are saved to `bench/output/` as HTML files with detailed statistics and charts.
### Pre-commit Hooks
Install pre-commit hooks for automatic code quality checks:
```bash
# Install pre-commit (Python package)
pip install pre-commit
# Or with Homebrew
brew install pre-commit
# Install hooks
pre-commit install
# Run on all files
pre-commit run --all-files
```
Hooks automatically run `mix format`, `mix compile --warnings-as-errors`, and `mix credo --strict` before each commit.
### Profiling
See [guides/profiling.md](guides/profiling.md) for profiling techniques using Elixir's built-in tools (fprof, eprof, cprof).
## Docker Management
### Using the bundled scripts
All Compose profiles live under `ci/` (ported from the Python client). The shell scripts manage multiple configurations:
```bash
# Start all profiles (single node, modules, RBAC, cluster, async, proxy, backup)
./ci/start_weaviate.sh 1.28.14
# Async-only sandbox for journey tests
./ci/start_weaviate_jt.sh 1.28.14
# Stop all containers
./ci/stop_weaviate.sh
```
Edit `ci/compose.sh` to add/remove compose files from the managed set.
### Available Docker Compose Profiles
| File | Port(s) | Description |
|------|---------|-------------|
| `docker-compose.yml` | 8080, 50051 | Primary single-node instance |
| `docker-compose-rbac.yml` | 8092 | RBAC-enabled instance |
| `docker-compose-backup.yml` | 8093 | Backup-enabled instance |
| `docker-compose-cluster.yml` | 8087-8089 | 3-node cluster |
| `docker-compose-async.yml` | 8090 | Async/journey test instance |
| `docker-compose-modules.yml` | 8091 | Module-enabled instance |
| `docker-compose-proxy.yml` | 8094 | Proxy configuration |
### Direct Docker Compose commands
```bash
# Spawn just the baseline stack
docker compose -f ci/docker-compose.yml up -d
# Inspect the cluster nodes
docker compose -f ci/docker-compose-cluster.yml ps
# Tail logs for the RBAC profile
docker compose -f ci/docker-compose-rbac.yml logs -f
# Remove everything (data included)
docker compose -f ci/docker-compose.yml down -v
```
### Troubleshooting tips
```bash
# Confirm Docker is running
docker info
# See which services are up for a given profile
docker compose -f ci/docker-compose-backup.yml ps -a
# Check the ready endpoint of the primary instance
curl http://localhost:8080/v1/.well-known/ready
# Query metadata
curl http://localhost:8080/v1/meta
```
## Authentication
For **production or cloud Weaviate instances** that require authentication, supply credentials in one of the following ways:
### Environment Variables (Recommended)
```bash
# Add to .env file (NOT committed to git)
WEAVIATE_URL=https://your-cluster.weaviate.network
WEAVIATE_API_KEY=your-secret-api-key-here
# Or add to ~/.bash_secrets (sourced by ~/.bashrc)
export WEAVIATE_URL=https://your-cluster.weaviate.network
export WEAVIATE_API_KEY=your-secret-api-key-here
```
### Runtime Configuration (Production)
```elixir
# config/runtime.exs
config :weaviate_ex,
url: System.fetch_env!("WEAVIATE_URL"),
api_key: System.fetch_env!("WEAVIATE_API_KEY"),
strict: true # Fail fast if unreachable
```
### Development Configuration
```elixir
# config/dev.exs (NEVER commit production keys!)
config :weaviate_ex,
url: "http://localhost:8080",
api_key: nil # No auth for local development
```
### Client Auth Helpers (API Key / OIDC)
Configure auth directly in the client for per-connection credentials and automatic OIDC refresh:
```elixir
alias WeaviateEx.Auth
# API key
{:ok, client} =
WeaviateEx.Client.connect(
base_url: "https://your-cluster.weaviate.network",
auth: Auth.api_key("your-secret-api-key")
)
# OIDC client credentials (auto-refresh)
auth = Auth.client_credentials("client-id", "client-secret", scopes: ["openid", "profile"])
{:ok, client} =
WeaviateEx.Client.connect(
base_url: "https://your-cluster.weaviate.network",
auth: auth
)
# Skip init checks if needed
{:ok, client} =
WeaviateEx.Client.connect(
base_url: "https://your-cluster.weaviate.network",
auth: auth,
skip_init_checks: true
)
```
OIDC access tokens are refreshed automatically and applied to HTTP headers and gRPC metadata.
**Security Best Practices:**
- ✅ Never commit API keys to version control
- ✅ Use environment variables for production
- ✅ Add `.env` to `.gitignore` (already done)
- ✅ Use `System.fetch_env!/1` to fail fast on missing keys
- ✅ Store production secrets in secure vaults (e.g., AWS Secrets Manager)
- ✅ Use different keys for dev/staging/production
## Connection Management
### Connecting to Weaviate Cloud (v0.7.4+)
WeaviateEx provides full support for Weaviate Cloud Service (WCS) with automatic configuration:
```elixir
alias WeaviateEx.Connect
# Connect to Weaviate Cloud with API key
config = Connect.to_weaviate_cloud(
cluster_url: "my-cluster.weaviate.network",
api_key: "your-wcs-api-key"
)
{:ok, client} = WeaviateEx.Client.connect(
base_url: config.base_url,
grpc_host: config.grpc_host,
grpc_port: config.grpc_port,
api_key: config.api_key,
additional_headers: Map.new(config.headers)
)
```
**Automatic WCS Features:**
- **gRPC Host Detection**: `.weaviate.network` clusters use the `{ident}.grpc.{domain}` pattern; other cloud domains use `grpc-{hostname}`
- **X-Weaviate-Cluster-URL Header**: Automatically added for embedding service integration
- **TLS/Port 443**: HTTPS and gRPC-TLS enforced for cloud clusters
```elixir
# Different WCS domains are handled correctly:
Connect.to_weaviate_cloud(cluster_url: "my-cluster.weaviate.network")
# gRPC host: my-cluster.grpc.weaviate.network
Connect.to_weaviate_cloud(cluster_url: "my-cluster.aws.weaviate.cloud")
# gRPC host: grpc-my-cluster.aws.weaviate.cloud
```
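The host mapping shown above can be sketched as a pure function. This is an illustration only: `GrpcHostSketch` is a hypothetical module, and the rule is inferred from the two examples rather than taken from the WeaviateEx source, so the real detection logic may handle more cases.

```elixir
defmodule GrpcHostSketch do
  @moduledoc "Hypothetical sketch of the host rule inferred from the examples; not WeaviateEx code."

  def grpc_host(cluster_url) do
    # Strip an optional scheme and trailing slash to get the bare hostname.
    host =
      cluster_url
      |> String.replace_prefix("https://", "")
      |> String.replace_prefix("http://", "")
      |> String.trim_trailing("/")

    if String.ends_with?(host, ".weaviate.network") do
      # {ident}.grpc.{domain}
      [ident, domain] = String.split(host, ".", parts: 2)
      "#{ident}.grpc.#{domain}"
    else
      # grpc-{hostname}
      "grpc-" <> host
    end
  end
end

IO.puts(GrpcHostSketch.grpc_host("my-cluster.weaviate.network"))
IO.puts(GrpcHostSketch.grpc_host("my-cluster.aws.weaviate.cloud"))
```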
### Server Version Requirements
WeaviateEx requires Weaviate server version **1.27.0 or higher**. The client validates the server version on connection.
```elixir
# Version check happens automatically during connect
{:ok, client} = WeaviateEx.Client.connect(
base_url: "http://localhost:8080"
)
# To bypass version checks (not recommended)
{:ok, client} = WeaviateEx.Client.connect(
base_url: "http://localhost:8080",
skip_init_checks: true
)
```
When connecting to an unsupported version, you'll receive a clear error:
```
Weaviate server version 1.20.0 is below minimum required 1.27.0
```
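A minimum-version check of this kind can be written with Elixir's built-in `Version` module. The sketch below is an assumption about the general approach, not the library's actual code; `VersionCheckSketch` is a hypothetical name, and the error string simply mirrors the message shown above.

```elixir
defmodule VersionCheckSketch do
  @moduledoc "Hypothetical sketch of a minimum-version check; not WeaviateEx code."

  @minimum "1.27.0"

  # Returns :ok for supported versions and an error tuple otherwise.
  # Version.compare/2 raises on strings that are not valid semantic versions.
  def check(server_version) do
    case Version.compare(server_version, @minimum) do
      :lt ->
        {:error,
         "Weaviate server version #{server_version} is below minimum required #{@minimum}"}

      _eq_or_gt ->
        :ok
    end
  end
end

IO.inspect(VersionCheckSketch.check("1.28.14"))
IO.inspect(VersionCheckSketch.check("1.20.0"))
```

Comparing parsed versions rather than raw strings matters here: as strings, `"1.9.0"` sorts after `"1.27.0"`, while `Version.compare/2` orders them numerically.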
### Connection Pool Configuration (v0.6.0+)
Configure HTTP and gRPC connection pools for optimal performance:
```elixir
alias WeaviateEx.Client.Pool
# Create custom pool configuration
http_pool = Pool.new(
size: 20, # Number of connections in pool
overflow: 10, # Maximum overflow connections
strategy: :lifo, # Connection selection (:fifo or :lifo)
timeout: 5000, # Checkout timeout in ms
idle_timeout: 60_000, # Idle connection timeout in ms
max_age: nil # Max connection age (nil = no limit)
)
# Use preset configurations
http_pool = Pool.default_http() # Optimized for HTTP/Finch
grpc_pool = Pool.default_grpc() # Optimized for gRPC (fewer connections)
# Convert to client options
finch_opts = Pool.to_finch_opts(http_pool)
grpc_opts = Pool.to_grpc_opts(grpc_pool)
```
### Simplified Connection Config (v0.7.0+)
For high-load scenarios, use the new Connection config:
```elixir
alias WeaviateEx.Config.Connection
# Create connection config with custom settings
config = Connection.new(
pool_size: 20, # Connections per pool
max_connections: 200, # Maximum total connections
pool_timeout: 10_000, # Pool checkout timeout (ms)
max_idle_time: 60_000 # Max idle time before close (ms)
)
# Use in client creation
{:ok, client} = WeaviateEx.Client.connect(
base_url: "http://localhost:8080",
connection: config
)
# Or pass options directly
{:ok, client} = WeaviateEx.Client.connect(
base_url: "http://localhost:8080",
connection: [pool_size: 20, max_connections: 200]
)
```
### Proxy Configuration (v0.7.3+)
Use proxy settings for HTTP, HTTPS, and gRPC connections:
```elixir
alias WeaviateEx.Config.Proxy
{:ok, client} =
WeaviateEx.Client.connect(
base_url: "https://your-cluster.weaviate.network",
proxy: Proxy.new(
http: "http://proxy.example.com:8080",
https: "https://proxy.example.com:8443",
grpc: "http://grpc-proxy.example.com:8080"
)
)
# Or read from HTTP_PROXY / HTTPS_PROXY / GRPC_PROXY
{:ok, client} =
WeaviateEx.Client.connect(
base_url: "https://your-cluster.weaviate.network",
proxy: :env
)
```
### HTTP Retry Configuration (v0.7.4+)
WeaviateEx automatically retries failed HTTP requests with exponential backoff and jitter.
Retries are triggered for both transport errors (network issues) and transient HTTP status codes.
**Retryable errors:**
- Transport: connection refused, reset, timeout, closed, DNS failure
- HTTP status codes: 408, 429, 500, 502, 503, 504
```elixir
# Configure retry options when creating a client
{:ok, client} = WeaviateEx.Client.connect(
base_url: "http://localhost:8080",
retry: [
max_retries: 3, # Maximum retry attempts (default: 3)
base_delay_ms: 100, # Base delay for exponential backoff (default: 100)
max_delay_ms: 5000 # Maximum delay cap (default: 5000)
]
)
# Or override per-request
{:ok, data} = WeaviateEx.API.Data.get(client, "Article", uuid,
max_retries: 5,
base_delay_ms: 200,
max_delay_ms: 10000
)
```
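The list of retryable errors can be expressed as a small predicate. This is a sketch under assumptions: `RetryableSketch` is hypothetical, the status codes come from the list above, and the transport reason atoms are common Erlang/OTP names chosen for illustration, not necessarily the ones WeaviateEx matches on.

```elixir
defmodule RetryableSketch do
  @moduledoc "Hypothetical classifier for retryable results; not WeaviateEx code."

  # HTTP status codes listed as retryable above.
  @retryable_statuses [408, 429, 500, 502, 503, 504]

  # Assumed atoms for: connection refused, reset, timeout, closed, DNS failure.
  @retryable_reasons [:econnrefused, :econnreset, :timeout, :closed, :nxdomain]

  def retryable?({:ok, %{status: status}}), do: status in @retryable_statuses
  def retryable?({:error, reason}), do: reason in @retryable_reasons
  def retryable?(_other), do: false
end

IO.inspect(RetryableSketch.retryable?({:ok, %{status: 503}}))
IO.inspect(RetryableSketch.retryable?({:ok, %{status: 404}}))
```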
**Backoff strategy:**
- Uses exponential backoff: `delay = base_delay_ms × 2^attempt` (attempts are zero-based)
- Adds ±10% random jitter to prevent thundering herd
- Capped at `max_delay_ms`
Example delays with defaults (base=100ms, max=5000ms):
- Attempt 0: ~100ms
- Attempt 1: ~200ms
- Attempt 2: ~400ms
- Attempt 3: ~800ms
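The delays above follow directly from the formula. A minimal sketch, assuming a hypothetical `BackoffSketch` module and assuming jitter is applied before the cap (the order is not stated above):

```elixir
defmodule BackoffSketch do
  @moduledoc "Hypothetical backoff calculator; not WeaviateEx code."

  # Delay without jitter: base_delay_ms * 2^attempt, capped at max_delay_ms.
  def base_delay(attempt, base_delay_ms \\ 100, max_delay_ms \\ 5000) do
    min(base_delay_ms * Integer.pow(2, attempt), max_delay_ms)
  end

  # Delay with a random factor in the range 0.9..1.1, then capped.
  def delay(attempt, base_delay_ms \\ 100, max_delay_ms \\ 5000) do
    raw = base_delay_ms * Integer.pow(2, attempt)
    jitter = 0.9 + :rand.uniform() * 0.2
    min(round(raw * jitter), max_delay_ms)
  end
end

# Un-jittered delays for attempts 0..3 with the defaults
IO.inspect(Enum.map(0..3, &BackoffSketch.base_delay/1))
```

With the defaults, the cap first takes effect at attempt 6, where `100 × 2^6 = 6400` exceeds `max_delay_ms`.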
### Per-Operation Timeouts (v0.7.4+)
WeaviateEx uses different timeouts based on operation type:
| Operation Type | Default Timeout | Description |
|----------------|-----------------|-------------|
| Query/GET | 30 seconds | Search, read operations |
| Insert/POST | 90 seconds | Write, update operations |
| Batch | 900 seconds | Batch operations (insert × 10) |
| Init | 2 seconds | Connection initialization |
```elixir
# Configure timeouts in client
{:ok, client} = WeaviateEx.Client.connect(
base_url: "http://localhost:8080",
timeout_config: WeaviateEx.Config.Timeout.new(
query: 60_000, # 60 seconds for queries
insert: 180_000, # 180 seconds for inserts
init: 5_000 # 5 seconds for init
)
)
# Override per-request
{:ok, data} = WeaviateEx.API.Data.get(client, "Article", uuid,
timeout: 60_000 # Explicit timeout override
)
# Specify operation type for automatic timeout selection
{:ok, result} = WeaviateEx.API.Batch.create_objects(client, objects,
operation: :batch # Uses extended batch timeout
)
```
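The relationship between the operation types in the table can be sketched as a lookup. `TimeoutSketch` is hypothetical, and deriving the batch timeout from a customized insert timeout is an assumption based on the "insert × 10" note in the table:

```elixir
defmodule TimeoutSketch do
  @moduledoc "Hypothetical timeout selector; not WeaviateEx code."

  # Defaults from the table above, in milliseconds.
  @defaults %{query: 30_000, insert: 90_000, init: 2_000}

  # Returns the timeout for an operation type, applying keyword overrides.
  # Unknown operation types raise a CaseClauseError.
  def for_operation(operation, overrides \\ []) do
    config = Map.merge(@defaults, Map.new(overrides))

    case operation do
      :batch -> config.insert * 10
      op when op in [:query, :insert, :init] -> Map.fetch!(config, op)
    end
  end
end

IO.inspect(TimeoutSketch.for_operation(:batch))
IO.inspect(TimeoutSketch.for_operation(:query, query: 60_000))
```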
### Client Lifecycle Management (v0.6.0+)
Manage client connections with explicit lifecycle control:
```elixir
alias WeaviateEx.Client
# Create and use a client
{:ok, client} = Client.new(base_url: "http://localhost:8080")
# Check client status
Client.status(client) # => :connected, :initializing, :disconnected, :closed
# Check if client is closed
Client.closed?(client) # => false
# Get client statistics
stats = Client.stats(client)
IO.puts("Requests: #{stats.request_count}")
IO.puts("Errors: #{stats.error_count}")
IO.puts("Created: #{stats.created_at}")
# Close the client when done
:ok = Client.close(client)
Client.closed?(client) # => true
```
### Resource Management with `with_client/2`
Automatic client lifecycle management with guaranteed cleanup:
```elixir
alias WeaviateEx.Client
# with_client ensures cleanup even on errors
result = Client.with_client([base_url: "http://localhost:8080"], fn client ->
# Use client for operations
{:ok, meta} = WeaviateEx.health_check(client)
{:ok, collections} = WeaviateEx.Collections.list(client)
# Return your result
{:ok, %{version: meta["version"], collections: length(collections)}}
end)
# Client is automatically closed after the function returns
case result do
{:ok, data} -> IO.puts("Version: #{data.version}")
{:error, reason} -> IO.puts("Error: #{inspect(reason)}")
end
# Even if the function raises, client is closed
try do
Client.with_client([base_url: "http://localhost:8080"], fn _client ->
raise "Something went wrong"
end)
rescue
e -> IO.puts("Caught: #{Exception.message(e)}")
# Client was still properly closed
end
```
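The cleanup guarantee is the standard `try ... after` pattern. The sketch below shows the idea with an `Agent` standing in for a client connection; `WithResourceSketch` is a hypothetical module and makes no claim about how `with_client/2` is implemented internally.

```elixir
defmodule WithResourceSketch do
  @moduledoc "Hypothetical illustration of guaranteed cleanup; not WeaviateEx code."

  # Starts an Agent as a stand-in resource, passes it to fun,
  # and stops it afterwards whether fun returns or raises.
  def with_resource(initial_state, fun) when is_function(fun, 1) do
    {:ok, pid} = Agent.start(fn -> initial_state end)

    try do
      fun.(pid)
    after
      Agent.stop(pid)
    end
  end
end

{:ok, pid} = WithResourceSketch.with_resource(%{}, fn pid -> {:ok, pid} end)
IO.inspect(Process.alive?(pid))
```

The `after` block does not swallow the exception: the error still propagates to the caller once cleanup has run.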
## Debug & Troubleshooting
### Debug Module (v0.6.0+)
Compare REST and gRPC protocol responses for debugging:
```elixir
alias WeaviateEx.Debug
# Get an object via REST (HTTP)
{:ok, rest_obj} = Debug.get_object_rest(client, "Article", uuid)
{:ok, rest_obj} =
Debug.get_object_rest(client, "Article", uuid,
node_name: "node-1",
consistency_level: "ALL"
)
# Get the same object via gRPC
{:ok, grpc_obj} = Debug.get_object_grpc(client, "Article", uuid)
# Compare both protocols and get a detailed diff
{:ok, comparison} = Debug.compare_protocols(client, "Article", uuid)
# Check comparison results
comparison.match? # => true or false
comparison.rest_object # => %{...}
comparison.grpc_object # => %{...}
comparison.differences # => [] or list of differences
# Get connection diagnostics
{:ok, info} = Debug.connection_info(client)
IO.puts("HTTP Base URL: #{info.http_base_url}")
IO.puts("gRPC Connected: #{info.grpc_connected}")
IO.puts("gRPC Host: #{info.grpc_host}:#{info.grpc_port}")
```
### Object Comparison
Deep comparison of objects from different sources:
```elixir
alias WeaviateEx.Debug.ObjectCompare
# Compare two objects
result = ObjectCompare.compare(rest_object, grpc_object)
result.match? # => true if objects are equivalent
result.differences # => list of differences found
# Get a formatted diff report
diff_list = ObjectCompare.diff(rest_object, grpc_object)
report = ObjectCompare.format_diff(diff_list)
IO.puts(report)
# Output:
# - properties.title: "REST Title" vs "gRPC Title"
# - _additional.vector: [0.1, 0.2, ...] vs [0.1, 0.2, ...]
```
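
To make the idea of a path-based diff concrete, here is a small self-contained sketch that walks two nested maps and reports each differing leaf with its dotted path. It is an illustration of the general technique, not the code behind `ObjectCompare`. The `DiffSketch` module and its functions are hypothetical.

```elixir
defmodule DiffSketch do
  @moduledoc "Illustrative path-based diff of nested maps; not WeaviateEx's implementation."

  # Returns a sorted list of {path, left, right} tuples, one per differing leaf.
  def diff(left, right), do: left |> do_diff(right, []) |> Enum.sort()

  # Two maps: recurse key by key over the union of their keys.
  # A missing key is treated the same as an explicit nil in this sketch.
  defp do_diff(left, right, path) when is_map(left) and is_map(right) do
    keys = Enum.uniq(Map.keys(left) ++ Map.keys(right))

    Enum.flat_map(keys, fn key ->
      do_diff(Map.get(left, key), Map.get(right, key), path ++ [key])
    end)
  end

  # Identical values: nothing to report.
  defp do_diff(same, same, _path), do: []

  # Anything else is a differing leaf; record its dotted path.
  defp do_diff(left, right, path), do: [{Enum.join(path, "."), left, right}]

  # Render one "- path: left vs right" line per difference.
  def format(diffs) do
    Enum.map_join(diffs, "\n", fn {path, left, right} ->
      "- #{path}: #{inspect(left)} vs #{inspect(right)}"
    end)
  end
end
```

Lists are compared as whole values here, so two vectors that differ in one element show up as a single difference rather than one per index.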
### Request Logging
Log and analyze HTTP/gRPC requests for debugging:
```elixir
alias WeaviateEx.Debug.RequestLogger
# Start the request logger
{:ok, logger} = RequestLogger.start_link(name: :my_logger)
# Enable logging
RequestLogger.enable(logger)
# Log requests manually or via middleware
RequestLogger.log_request(logger, %{
  method: :get,
  path: "/v1/schema",
  protocol: :http,
  duration_ms: 45,
  status: 200
})
# Get recent logs
logs = RequestLogger.get_logs(logger)
for log <- logs do
  IO.puts("#{log.protocol} #{log.method} #{log.path} - #{log.status} (#{log.duration_ms}ms)")
end
# Filter logs
http_logs = RequestLogger.get_logs(logger, protocol: :http)
slow_logs = RequestLogger.get_logs(logger, min_duration_ms: 100)
# Export logs for analysis
RequestLogger.export_logs(logger, "/tmp/weaviate_requests.json", :json)
RequestLogger.export_logs(logger, "/tmp/weaviate_requests.txt", :text)
# Clear logs
RequestLogger.clear_logs(logger)
# Disable when done
RequestLogger.disable(logger)
```
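
Once logs are retrieved, ordinary `Enum` functions are enough for quick analysis. The sketch below assumes each entry is a map (or struct) with the atom keys used above (`:protocol`, `:duration_ms`, and so on). The `LogSummarySketch` module is hypothetical and not part of the library.

```elixir
defmodule LogSummarySketch do
  @moduledoc "Illustrative helpers over plain log-entry maps; not part of WeaviateEx."

  # Keep only entries at or above a duration threshold (in milliseconds).
  def slow(logs, min_duration_ms) do
    Enum.filter(logs, &(&1.duration_ms >= min_duration_ms))
  end

  # Per-protocol request count and average duration in milliseconds.
  def by_protocol(logs) do
    logs
    |> Enum.group_by(& &1.protocol)
    |> Map.new(fn {protocol, entries} ->
      total = entries |> Enum.map(& &1.duration_ms) |> Enum.sum()
      {protocol, %{count: length(entries), avg_ms: total / length(entries)}}
    end)
  end
end
```

For example, `LogSummarySketch.by_protocol(logs)` gives a quick view of whether HTTP or gRPC calls dominate the request time.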
### Main Module Debug Helpers
Quick access to debug functions from the main module:
```elixir
# Get object via REST
{:ok, obj} = WeaviateEx.debug_get_rest(client, "Article", uuid)
# Compare protocols
{:ok, comparison} = WeaviateEx.debug_compare(client, "Article", uuid)
```
## Documentation
- **[INSTALL.md](INSTALL.md)** - Detailed installation guide for all platforms
- **[CHANGELOG.md](CHANGELOG.md)** - Version history and release notes
- **[API Documentation](https://hexdocs.pm/weaviate_ex)** - Full API reference on HexDocs
- **[Weaviate Docs](https://docs.weaviate.io)** - Official Weaviate documentation
- **Examples** - 8 runnable examples in the GitHub repository (see [Examples](#examples) section)
### Building Documentation Locally
```bash
# Generate docs
mix docs
# Open in browser (macOS)
open doc/index.html
# Open in browser (Linux)
xdg-open doc/index.html
```
## Development
```bash
# Clone the repository
git clone https://github.com/nshkrdotcom/weaviate_ex.git
cd weaviate_ex
# Install dependencies
mix deps.get
# Compile
mix compile
# Run unit tests (mocked - fast)
mix test
# Run integration tests (requires live Weaviate)
mix weaviate.start
mix test --include integration
# Generate documentation
mix docs
# Run code analysis
mix credo
# Run type checking (if dialyzer is set up)
mix dialyzer
# Format code
mix format
```
### Project Structure
```
weaviate_ex/
├── ci/
│   └── weaviate/                  # Docker assets mirrored from Python client
│       ├── compose.sh
│       ├── start_weaviate.sh
│       ├── docker-compose.yml
│       └── docker-compose-*.yml
├── priv/
│   └── protos/v1/                 # Weaviate gRPC proto definitions
│       ├── weaviate.proto
│       ├── batch.proto
│       ├── search_get.proto
│       └── ...
├── lib/
│   ├── weaviate_ex.ex             # Top-level API
│   ├── weaviate_ex/
│   │   ├── embedded.ex            # Embedded binary lifecycle manager
│   │   ├── dev_support/           # Internal tooling (compose helper)
│   │   ├── application.ex         # OTP application
│   │   ├── client.ex              # Client struct & config
│   │   ├── config.ex              # Configuration management
│   │   ├── error.ex               # Error types (HTTP + gRPC)
│   │   ├── filter.ex              # Filter DSL
│   │   ├── api/                   # API modules
│   │   │   ├── collections.ex
│   │   │   ├── data.ex
│   │   │   ├── aggregate.ex
│   │   │   ├── tenants.ex
│   │   │   └── vector_config.ex
│   │   ├── grpc/                  # gRPC infrastructure
│   │   │   ├── channel.ex         # Channel management
│   │   │   ├── services/          # gRPC service clients
│   │   │   │   ├── search.ex
│   │   │   │   ├── batch.ex
│   │   │   │   ├── aggregate.ex
│   │   │   │   ├── tenants.ex
│   │   │   │   └── health.ex
│   │   │   └── generated/v1/      # Proto-generated modules
│   │   └── ...
│   └── mix/
│       └── tasks/
│           ├── weaviate.start.ex
│           ├── weaviate.stop.ex
│           ├── weaviate.status.ex
│           └── weaviate.logs.ex
├── test/                          # Test suite
├── examples/                      # Runnable examples (in source repo)
├── install.sh                     # Legacy single-profile bootstrap
└── mix.exs                        # Project configuration
```
## Contributing
Contributions are welcome! Here's how you can help:
1. **Fork the repository**
2. **Create a feature branch**: `git checkout -b feature/amazing-feature`
3. **Write tests**: All new features should include tests
4. **Run tests**: `mix test` (should pass)
5. **Run integration tests**: `mix weaviate.start`, then `mix test --include integration` (optional but recommended)
6. **Run Credo**: `mix credo` (should pass)
7. **Commit changes**: `git commit -m 'Add amazing feature'`
8. **Push to branch**: `git push origin feature/amazing-feature`
9. **Open a Pull Request**
### CI/CD Pipeline
Pull requests automatically run the following GitHub Actions jobs:
| Job | Description |
|-----|-------------|
| `format-and-lint` | Code formatting and Credo linting |
| `unit-tests` | 2300+ unit tests with Mox mocking + Dialyzer |
| `integration-tests` | Integration tests against Weaviate 1.28.14 |
| `integration-matrix` | Tests against Weaviate 1.27, 1.28, 1.29, 1.30 (master/tags only) |
### Development Guidelines
- Write tests first (TDD approach)
- Maintain test coverage above 90%
- Follow Elixir style guide
- Add typespecs for public functions
- Update documentation for API changes
- Add examples for new features
- For API changes, add integration tests in `test/integration/`
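
As an illustration of the typespec and documentation guidelines above, a public function would typically carry both `@doc` and `@spec`. The module and function below are hypothetical and are not part of WeaviateEx. They only show the expected shape.

```elixir
defmodule UrlSketch do
  @moduledoc "Hypothetical helper used only to illustrate @doc and @spec."

  @doc """
  Removes trailing slashes from a base URL so request paths can be appended.
  """
  @spec normalize_base_url(String.t()) :: String.t()
  def normalize_base_url(url) when is_binary(url) do
    String.trim_trailing(url, "/")
  end
end
```

With the spec in place, `mix dialyzer` can flag callers that pass something other than a string.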
## License
MIT License. See [LICENSE](LICENSE) for details.
## Acknowledgments
- Built for the [Weaviate](https://weaviate.io) vector database
- Inspired by the official Python and TypeScript clients
- Uses [grpc-elixir](https://github.com/elixir-grpc/grpc) for high-performance gRPC operations
- Uses [Finch](https://github.com/sneako/finch) for HTTP/2 connection pooling (schema operations)
- Powered by Elixir and the BEAM VM
---
**Questions or Issues?** Open an issue on [GitHub](https://github.com/nshkrdotcom/weaviate_ex/issues)