Messaging protocol daemon for Email (IMAP/SMTP), SMS (Twilio), WhatsApp (Business API), Signal (E2E encrypted), Microsoft Teams (Graph API), and XMPP/Jabber.
Features: AI auto-response with approval workflow, unified message format, semantic search, webhook handling on port 8896
Multi-agent orchestration with tiered swarm hierarchy for autonomous goal execution.
Features: Task decomposition, agent coordination, UAP protocol, MCP support
Binary: eldric-swarmd
Deployment Scenarios
1. Development Setup
Single machine for local development and testing
2. Production Multi-Region
Global deployment with regional controllers
3. Enterprise Mainframe
IBM z/OS DB2 connectivity via DRDA
4. Knowledge Routing
AI-powered theme detection and model selection
5. Swarm Multi-Agent Orchestration
Autonomous goal execution with tiered agent hierarchy
Use Cases
Enterprise AI Analytics Platform
User Query → Edge/CDN → Router (Theme Detection) → AI Worker (SQL Generation) → Data Worker (Query DB) → AI Worker (Analyze Results)
Multi-Database RAG Pipeline
Combine data from multiple enterprise databases for AI-powered analysis:
# AI Worker queries multiple data sources
1. Query PostgreSQL for customer data
2. Query MySQL for transaction history
3. Query DB2 z/OS for mainframe records
4. AI synthesizes all data into unified response
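The four steps above can be sketched as follows. The `fetch_*` query helpers and the section format are hypothetical placeholders, since the actual Data Worker query API is not shown here; the point is how rows from heterogeneous sources get flattened into one context the AI Worker can synthesize from.

```python
# Hypothetical sketch of the multi-database RAG flow.
# The fetch_* callables stand in for Data Worker queries against
# PostgreSQL, MySQL, and DB2 z/OS; they are not the real API.

def build_unified_context(fetch_postgres, fetch_mysql, fetch_db2, customer_id):
    """Collect rows from each source and flatten them into a single
    text context for AI synthesis (step 4)."""
    sections = {
        "PostgreSQL customer data": fetch_postgres(customer_id),
        "MySQL transaction history": fetch_mysql(customer_id),
        "DB2 z/OS mainframe records": fetch_db2(customer_id),
    }
    lines = []
    for source, rows in sections.items():
        lines.append(f"## {source}")
        lines.extend(str(row) for row in rows)
    return "\n".join(lines)

# Usage with canned stub data:
context = build_unified_context(
    lambda cid: [{"name": "Acme"}],
    lambda cid: [{"amount": 42}],
    lambda cid: [{"record": "Z1"}],
    customer_id=7,
)
```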
Real-time Business Intelligence
Dashboard Request → Controller (API) → Data Workers → DB Cluster → AI Summary
Voice-Enabled AI Assistant
End-to-end voice chat using Media Worker for STT/TTS:
Voice Input (Audio Stream) → Media Worker (STT, Whisper) → AI Worker (LLM Response) → Media Worker (TTS, Piper) → Voice Output (Audio Stream)
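The voice round-trip above is a straight chain of the three worker calls. A minimal sketch, assuming hypothetical `stt`, `llm`, and `tts` callables standing in for the Media Worker and AI Worker endpoints (the real APIs are not shown here):

```python
# Hypothetical sketch of one voice turn: STT -> LLM -> TTS.
# stt/llm/tts are stand-ins for Media Worker and AI Worker calls.

def voice_turn(audio_in, stt, llm, tts):
    text = stt(audio_in)    # Media Worker: speech-to-text (Whisper)
    reply = llm(text)       # AI Worker: LLM response
    return tts(reply)       # Media Worker: text-to-speech (Piper)
```

Keeping each stage as a plain callable makes it easy to swap backends or insert steps (e.g. a Data Worker RAG lookup between STT and the LLM).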
Video Meeting Summarization
Transcribe and analyze recorded meetings:
# Video processing pipeline
1. Media Worker extracts audio from video
2. Media Worker transcribes with speaker diarization
3. Data Worker stores transcript with vector embeddings
4. Agent Worker generates meeting summary and action items
5. Comm Worker distributes summary via email/Teams
Unified Customer Communication
AI-powered multi-channel messaging with Comm Worker:
Customer (WhatsApp/Email/SMS) → Comm Worker (Protocol Adapter) → Data Worker (Message RAG) → AI Worker (Draft Response) → Comm Worker (Send Reply)
Agentic Knowledge Search
Complex question answering with iterative retrieval:
# Agent Worker performs multi-hop reasoning
1. Agent Worker receives complex query
2. Decomposes into sub-questions (Query Decomposition)
3. Retrieves relevant documents from Data Worker (RAG)
4. Iteratively refines search based on findings (ReAct)
5. Synthesizes final answer with citations
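The loop above can be sketched as follows. The `decompose`, `retrieve`, `refine`, and `synthesize` callables are hypothetical stand-ins for Agent Worker and Data Worker calls, not the real API; the sketch shows the ReAct-style shape of the iteration, not its implementation.

```python
# Hypothetical sketch of the Agent Worker's multi-hop retrieval loop.

def agentic_search(query, decompose, retrieve, refine, synthesize, max_hops=3):
    sub_questions = decompose(query)    # step 2: query decomposition
    findings = []
    for _ in range(max_hops):           # step 4: ReAct-style iteration
        # step 3: RAG retrieval from the Data Worker for each sub-question
        docs = [d for q in sub_questions for d in retrieve(q)]
        findings.extend(docs)
        next_questions = refine(query, findings)
        if not next_questions:          # no open sub-questions remain
            break
        sub_questions = next_questions
    return synthesize(query, findings)  # step 5: final answer with citations
```

Capping the loop with `max_hops` bounds cost even when `refine` keeps producing follow-up questions.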
Individual agent execution context with tool access, memory management, and inter-agent communication.
Execution Modes
Advisory
Swarm proposes actions, waits for human approval before each step. Maximum oversight.
Supervised
Runs autonomously but requires approval for critical operations (writes, executions, external calls).
Autonomous
Full autonomous operation with real-time monitoring. Best for trusted, well-defined goals.
Port Reference
| Component | Default Port | Protocol | Purpose |
|---|---|---|---|
| Controller | 8880 | HTTP/REST | Cluster API, Dashboard, Node Registration |
| Router | 8881 | HTTP/REST | Request Routing, Load Balancing |
| AI Worker | 8890 | HTTP/REST | LLM Inference, Tool Execution |
| Cloud Worker | 8889 | HTTP/REST | Multi-Backend Cloud Inference Gateway |
| Data Worker | 8892 | HTTP/REST | Storage, Database, Vector/RAG |
| Agent Worker | 8893 | HTTP/REST | Agentic RAG, Multi-Agent Execution |
| Media Worker | 8894 | HTTP/REST | STT, TTS, Video Processing |
| Comm Worker | 8895 | HTTP/REST | Email, SMS, WhatsApp, Signal, Teams, XMPP |
| Comm Webhooks | 8896 | HTTP | Incoming Message Webhooks |
| IIoT Worker | 8891 | HTTP/REST | Industrial IoT, OPC UA, Modbus, MQTT |
| Science Worker | 8897 | HTTP/REST | Bioinformatics, Pharma, CRISPR, LIMS |
| Training Worker | 8898 | HTTP/REST | AI Model Training, Fine-tuning |
| Edge Server | 443 | HTTPS | Unified Proxy, TLS, API Auth, Rate Limiting |
| Swarm Orchestrator | 8885 | HTTP/REST | Goal Management, Task Decomposition |
| Swarm Controller | 8886 | HTTP/REST | Agent Lifecycle, Task Scheduling |
| Agent Session | 8887 | HTTP/REST | Agent Execution, Tool Access |
| Data NFS | 2049 | NFS | Filesystem Access via NFS-Ganesha |
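For scripting against the cluster, the defaults above can be kept in one place. A small config sketch (the component keys are informal names, not official identifiers), with a sanity check that no two components claim the same default port:

```python
# Default ports from the table above, as a config sketch.
DEFAULT_PORTS = {
    "controller": 8880, "router": 8881, "swarm-orchestrator": 8885,
    "swarm-controller": 8886, "agent-session": 8887, "cloud-worker": 8889,
    "ai-worker": 8890, "iiot-worker": 8891, "data-worker": 8892,
    "agent-worker": 8893, "media-worker": 8894, "comm-worker": 8895,
    "comm-webhooks": 8896, "science-worker": 8897, "training-worker": 8898,
    "edge": 443, "data-nfs": 2049,
}

# Sanity check: every component gets a distinct default port.
assert len(set(DEFAULT_PORTS.values())) == len(DEFAULT_PORTS)
```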
Reserved Ports (Inference Backends)
| Backend | Port | Notes |
|---|---|---|
| Ollama | 11434 | Local LLM runtime |
| vLLM | 8000 | OpenAI-compatible API |
| TGI | 8080 | HuggingFace Text Generation |
| llama.cpp | 8081 | Native GGUF serving |
| Triton | 8000-8002 | NVIDIA multi-framework |
| TensorFlow Serving | 8501 | TensorFlow models |
Model Distribution Flow
Models flow through the cluster via two primary paths: from public registries or from the internal Data Worker registry.
Path 1: Public Registry Distribution
Ollama Hub (Public Models) → Controller (Model Registry) → AI Workers (Local Cache)
The controller tracks which models are available on which workers. When a model is requested that isn't cached locally, the worker pulls it from Ollama Hub on demand.
Path 2: Data Worker Registry Distribution
Data Worker (Model Storage) → Controller (Distribution) → AI Workers (Deploy & Serve)
Fine-tuned or custom models stored on Data Workers can be distributed to inference workers via the controller. This enables private model registries without relying on public hubs.
Cloud Worker Model Routing
Client Request (model: "gpt-4o") → Router (Model Lookup) → Cloud Worker (cld- prefix) → Cloud API (OpenAI / xAI / Anthropic)
Cloud Workers auto-discover models from connected cloud backends and register them with the controller. The router routes requests to the correct Cloud Worker based on model name.
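The lookup step can be sketched as a registry keyed by model name, where cloud-backed models carry the `cld-` prefix from the diagram above. The registry structure and worker addresses are assumptions for illustration; the real router internals are not documented here.

```python
# Hypothetical sketch of the router's model-name lookup.
# Cloud-backed models are assumed registered under a "cld-" prefix.

def route_model(model_name, registry):
    """registry maps registered model name -> worker address."""
    for key in (model_name, f"cld-{model_name}"):  # try local, then cloud
        if key in registry:
            return registry[key]
    raise LookupError(f"model not registered: {model_name}")

registry = {
    "llama3.1": "http://ai-worker-1:8890",       # local inference worker
    "cld-gpt-4o": "http://cloud-worker-1:8889",  # auto-discovered cloud model
}
```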
Path 3: Distributed Pipeline Inference
Data Worker (GGUF Model Store) → Controller (GGUF Parser + Shard Coordinator) → Workers ×N (each loads assigned layers via NFS)
For models too large for a single GPU, the Controller reads the GGUF file metadata from the Data Worker, calculates layer assignments proportional to each worker's available VRAM, and pushes shard configurations to workers. Workers start llama-rpc-server (middle/tail) or llama-server --rpc (head). The router auto-discovers the pipeline model from the head worker's heartbeat.
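The VRAM-proportional split can be sketched as follows. This is an illustration of the assignment arithmetic, not the Controller's actual shard coordinator; largest-remainder rounding keeps the layer total exact.

```python
# Sketch: split a model's layers across workers in proportion to
# each worker's available VRAM (largest-remainder rounding).

def assign_layers(total_layers, vram_per_worker):
    total_vram = sum(vram_per_worker)
    exact = [total_layers * v / total_vram for v in vram_per_worker]
    counts = [int(x) for x in exact]
    # Hand leftover layers to the largest fractional remainders so
    # the counts still sum to total_layers.
    by_remainder = sorted(range(len(exact)),
                          key=lambda i: exact[i] - counts[i], reverse=True)
    for i in by_remainder[:total_layers - sum(counts)]:
        counts[i] += 1
    return counts

# e.g. an 80-layer model over workers with 24, 24, and 12 GB free:
print(assign_layers(80, [24, 24, 12]))  # → [32, 32, 16]
```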
Sepp Hochreiter's extended LSTM architecture (xLSTM) is integrated across the cluster for predictive workloads, training, and anomaly detection.
Router: xLSTM Predictor
Trained xLSTM model in the router for workload forecasting. Predicts load spikes before they happen, enables pre-emptive scaling. Fast sequence classification for intent detection without LLM overhead. Anomaly detection flags unusual traffic patterns.
Training Worker: xLSTM Backend
Native xLSTM training backend with sLSTM (scalar LSTM with exponential gating) and mLSTM (matrix LSTM with covariance update) cell support. Fine-tune xLSTM models for domain-specific sequence tasks — time series, genomics, financial data.
Science Worker: Anomaly Detection
xLSTM-based anomaly detection on scientific time-series data. Trained on seismic, genomic, financial, and climate datasets. Detects outliers in real-time sensor streams from IoT and LIMS integrations.
Swarm LLM Ensemble
The Router's Swarm LLM feature sends queries to multiple models simultaneously and combines results using consensus strategies for higher-quality responses.
Ensemble Strategies
debate
Models argue over the answer, judge LLM picks the best reasoning
critique
First model generates, second critiques, synthesis from both
best_of_n
N responses generated, judge model scores and selects best
vote
Majority consensus across all participating models
How It Works
The router selects multiple workers with different models (e.g., Llama + Qwen + Mistral), sends the same query to all, then applies the selected ensemble strategy to produce a final response.
Client → Edge → Router (swarm=debate)
→ Worker1 (llama3.1) → Response A
→ Worker2 (qwen2.5) → Response B
→ Worker3 (mistral) → Response C
Router → Judge LLM → Best response → Client
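Two of the strategies from the table reduce to a few lines. A sketch, where responses are plain strings and the judge is a scoring callable — stand-ins, not the router's real API:

```python
# Sketch of the "vote" and "best_of_n" ensemble strategies above.
from collections import Counter

def vote(responses):
    """Majority consensus across all participating models."""
    winner, _ = Counter(responses).most_common(1)[0]
    return winner

def best_of_n(responses, judge_score):
    """Judge model scores each response; highest score wins."""
    return max(responses, key=judge_score)
```

The `debate` and `critique` strategies need an extra LLM round-trip per step and do not reduce this cleanly.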
Eldric cluster nodes are location-independent. Spread workers across university labs, corporate datacenters, home offices, and cloud providers. Workers connect through the Edge TLS gateway — no VPN required. Workers behind NAT use the built-in tunnel to receive inference requests through outbound-only connections.
Direct Registration via Edge
Workers with internet access register through the Edge TLS gateway. API key authentication, rate limiting, and encryption are handled by the Edge. The worker just sets --controller https://edge.example.com
Tunnel for NAT/Firewall
Workers behind NAT connect outbound to the Edge tunnel. They long-poll for inference requests — no inbound ports, no VPN, no public IP needed. The Edge queues requests and delivers them through the tunnel.
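The worker side of the tunnel is an outbound-only loop. A sketch of its shape, with `poll` and `deliver` as stand-in callables, since the tunnel protocol's endpoints and payloads are not documented here:

```python
# Hypothetical sketch of the NAT-side tunnel loop: long-poll the
# Edge for queued requests, run inference, post the result back.
# All connections are outbound; no inbound port is ever opened.

def tunnel_loop(poll, run_inference, deliver, running):
    """poll() blocks until the Edge has a queued request, or times
    out and returns None; deliver() posts the result back outbound."""
    while running():
        request = poll()
        if request is None:   # long-poll timed out; reconnect
            continue
        result = run_inference(request)
        deliver(request["id"], result)
```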
VPN / Private Network
For sites already connected via VPN or WireGuard, workers register directly with the controller on the private network. No Edge proxy needed — just --controller http://ctrl:8880
# Central datacenter (Vienna) — direct
./eldric-controller --port 8880
./eldric-edge --port 443 --cert /etc/ssl/cert.pem --key /etc/ssl/key.pem --controller http://localhost:8880
# University Lab A (Zurich) — registers through Edge
ssh zurich-lab "./eldric-workerd --backend ollama --controller https://edge.example.com --api-key sk-lab-a"
# University Lab B (Munich) — registers through Edge
ssh munich-lab "./eldric-workerd --backend vllm --controller https://edge.example.com --api-key sk-lab-b"
# Remote researcher (home office, behind NAT) — uses tunnel
./eldric-workerd --backend ollama --tunnel https://edge.example.com --api-key sk-remote-1