API reference

The complete endpoint surface.

Twenty-five sections covering every customer-facing endpoint shipped in Eldric 5.0 GA: the public OpenAI-compatible endpoints at the Edge, the agent / data / media / comm / science / training worker surfaces, the cluster-operations endpoints, the xLSTM daemon, and the marketplace / observability / license surfaces. Internal cluster-to-cluster paths are not enumerated here.


Quick start — code examples

The two ways to call Eldric.

OpenAI-shaped curl

curl https://your-cluster.example.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b",
    "messages": [{"role":"user","content":"Hello."}],
    "stream": true
  }'

OpenAI Python SDK

import os

from openai import OpenAI

# Read the key from the environment; Python does not expand "$API_KEY".
client = OpenAI(
    base_url="https://your-cluster.example.com/v1",
    api_key=os.environ["API_KEY"],
)
resp = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role":"user","content":"Hello."}],
)
print(resp.choices[0].message.content)
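
The curl request above sets "stream": true. Assuming the Edge follows the OpenAI streaming convention (server-sent events, one `data: <json>` line per chunk, terminated by `data: [DONE]`), which this page implies but does not spell out, the text fragments could be collected roughly as follows. `iter_deltas` is a hypothetical helper name, and the sample lines are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines.

    Assumes each event is a single 'data: <json>' line and that the
    stream ends with 'data: [DONE]' (OpenAI convention; not confirmed
    on this page for Eldric).
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Illustrative input only; a real client would iterate over the HTTP response lines.
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo."}}]}',
    "data: [DONE]",
]
print("".join(iter_deltas(sample)))
```

Keeping the parser separate from the HTTP call makes it easy to test against captured output from the cluster before wiring it into a client.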

Eldric-native agent invocation

curl -X POST https://your-cluster.example.com/api/v1/agent/chat \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Summarise the new contract attachments",
    "knowledge_base_id": "kb-contracts-2026"
  }'

Recent additions — 5.0 GA

The customer-facing endpoints shipped in the run-up to GA.

These are the ten endpoints across the RAG + routing surface that customers consume directly. Cluster-internal paths (worker poll loops, peer aggregator internals, reindex orchestrator state) are not listed.

Intelligent upload — analyse → preview → commit → poll

Customers drag a file in; the platform inspects it, suggests chunking and enrichment parameters, lets the user preview the chunks, then commits with the chosen params. Backed by four endpoints.

# 1. Analyse a file — get suggested params back
curl -X POST https://your-cluster.example.com/api/v1/upload/analyze \
  -H "Authorization: Bearer $API_KEY" \
  -F "file=@./paper.pdf"
# → returns {content_type, language, estimated_chunks, suggested_strategy,
#            suggested_enrichment, extracted_keywords, ...}

# 2. Preview the chunks a given strategy would produce
curl -X POST https://your-cluster.example.com/api/v1/upload/preview-chunks \
  -H "Authorization: Bearer $API_KEY" \
  -F "file=@./paper.pdf" \
  -F "strategy=semantic" -F "chunk_size=512" -F "overlap=50"
# → returns first 5–10 chunks for review

# 3. Commit the ingestion job with the chosen params
#    (file_id is the value returned by /analyze)
curl -X POST https://your-cluster.example.com/api/v1/upload/commit \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "file_id": "...",
    "tenant_id": "default",
    "namespace_id": "ai-papers",
    "strategy": "semantic",
    "chunk_size": 512,
    "overlap": 50,
    "enrichment": {"authors": true, "doi": true, "topic_tags": true}
  }'
# → returns {job_id, estimated_duration_seconds}

# 4. Poll the job
curl https://your-cluster.example.com/api/v1/upload/jobs/<job_id> \
  -H "Authorization: Bearer $API_KEY"
# → returns {status, chunks_done, total, embedding_done, ...}
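
Step 4 is usually run in a loop. A minimal polling sketch, assuming the job eventually reports a terminal value in its status field: the values "completed" and "failed" used here are placeholders, since this page does not list the actual status strings. `poll_job` and `fetch_status` are hypothetical names; `fetch_status` stands in for whatever performs the GET on /api/v1/upload/jobs/<job_id> and returns the decoded JSON.

```python
import time
from typing import Any, Callable, Dict, FrozenSet


def poll_job(
    fetch_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    terminal: FrozenSet[str] = frozenset({"completed", "failed"}),  # assumed values
    interval_s: float = 2.0,
    max_polls: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status until the job reports a terminal status."""
    for _ in range(max_polls):
        status = fetch_status(job_id)
        if status.get("status") in terminal:
            return status
        sleep(interval_s)
    raise TimeoutError(f"job {job_id} not finished after {max_polls} polls")


# Stand-in for the HTTP call so the sketch runs offline.
_responses = iter([
    {"status": "running", "chunks_done": 10, "total": 40},
    {"status": "running", "chunks_done": 30, "total": 40},
    {"status": "completed", "chunks_done": 40, "total": 40},
])
final = poll_job(lambda job_id: next(_responses), "job-123", sleep=lambda s: None)
print(final["status"], final["chunks_done"], final["total"])
```

The poll cap keeps a stuck job from blocking the caller forever; the estimated_duration_seconds value returned by the commit step could be used to pick a sensible interval.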

Per-namespace chunking config

Set the default chunking strategy at the knowledge-base level. Uploads into that namespace skip the suggestion dialog (or pre-fill it).

curl -X POST https://your-cluster.example.com/api/v1/vector/namespaces/<tenant>/<ns>/config \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "chunk_size": 512,
    "chunk_overlap": 50,
    "strategy": "semantic"
  }'
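
A small sketch of building that request from code. Note that this endpoint names the field chunk_overlap, while the upload commit body uses overlap. The range check below is a client-side sanity check chosen for illustration, not a documented server rule, and `namespace_config_request` is a hypothetical helper name.

```python
from typing import Any, Dict, Tuple
from urllib.parse import quote


def namespace_config_request(
    base_url: str,
    tenant: str,
    namespace: str,
    chunk_size: int,
    chunk_overlap: int,
    strategy: str,
) -> Tuple[str, Dict[str, Any]]:
    """Return (url, json_body) for the per-namespace config POST."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Assumed constraint: an overlap as large as the chunk itself makes no progress.
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    # Percent-encode path segments so unusual tenant or namespace names stay valid.
    path = (
        "/api/v1/vector/namespaces/"
        f"{quote(tenant, safe='')}/{quote(namespace, safe='')}/config"
    )
    body = {
        "chunk_size": chunk_size,
        "chunk_overlap": chunk_overlap,
        "strategy": strategy,
    }
    return base_url.rstrip("/") + path, body


url, body = namespace_config_request(
    "https://your-cluster.example.com", "default", "ai-papers", 512, 50, "semantic"
)
print(url)
print(body)
```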

Retention loop — accept / reject signal

A customer's thumbs-up on an assistant turn triggers the retention loop: cited sources auto-ingest into the knowledge base, the session feeds the dream cycle, and hot patterns become candidates for the next training corpus.

# "verdict" is either "accept" or "reject"
curl -X POST https://your-cluster.example.com/api/v1/chat/feedback \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "sess-abc123",
    "message_id": "msg-456",
    "verdict": "accept"
  }'
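
When the feedback call is issued from application code, generating the body with a JSON serialiser avoids hand-written JSON mistakes. A minimal sketch, using only the three fields and two verdict values shown above; `feedback_body` is a hypothetical helper name.

```python
import json


def feedback_body(session_id: str, message_id: str, verdict: str) -> str:
    """Serialise the /api/v1/chat/feedback request body."""
    if verdict not in ("accept", "reject"):
        raise ValueError(f"verdict must be 'accept' or 'reject', got {verdict!r}")
    return json.dumps(
        {"session_id": session_id, "message_id": message_id, "verdict": verdict}
    )


print(feedback_body("sess-abc123", "msg-456", "reject"))
```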

Router — classify a query

Returns the assigned intent class with a confidence score and the source of the decision (base classifier, overlay, or LLM-fallback). Useful for debugging routing and surfacing classification in admin dashboards.

curl -X POST https://your-cluster.example.com/api/v1/router/classify \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query":"summarise the indemnification clause in contract X"}'

# → {
#     "class": "ContractReview",
#     "confidence": 0.91,
#     "source": "overlay",
#     "fallback_chain": ["base", "overlay"]
#   }
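
For an admin dashboard, the response can be mapped onto a small typed record. A sketch based on the four fields in the sample response above; the 0.6 review threshold is an arbitrary illustration, not a platform default, and `Classification`, `parse_classification` and `needs_review` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Classification:
    intent: str  # the response field is "class", a reserved word in Python
    confidence: float
    source: str
    fallback_chain: List[str]


def parse_classification(body: Dict[str, Any]) -> Classification:
    """Map a decoded /api/v1/router/classify response onto a record."""
    return Classification(
        intent=body["class"],
        confidence=float(body["confidence"]),
        source=body["source"],
        fallback_chain=list(body.get("fallback_chain", [])),
    )


def needs_review(result: Classification, threshold: float = 0.6) -> bool:
    """Flag low-confidence results; the threshold is illustrative only."""
    return result.confidence < threshold


result = parse_classification({
    "class": "ContractReview",
    "confidence": 0.91,
    "source": "overlay",
    "fallback_chain": ["base", "overlay"],
})
print(result.intent, result.source, needs_review(result))
```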

Router — custom intent classes (per-tenant)

Pro+. Requires the admin role and is scoped to the tenant boundary. Register your own intent classes ("PatientTriage", "ContractReview", "AnomalyTrend") alongside the 128 built-ins. The classifier consults the overlay when the base classifier's confidence falls below the threshold.

# UPSERT a class
curl -X POST https://your-cluster.example.com/api/v1/router/custom-classes \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "law-firm-acme",
    "class_name": "ContractReview",
    "description": "queries about clause review, redlines, indemnification",
    "examples": ["summarise the indemnification clause in...", "what's the term length on...", ...]
  }'

# List the tenant's taxonomy
curl "https://your-cluster.example.com/api/v1/router/custom-classes?tenant_id=law-firm-acme" \
  -H "Authorization: Bearer $API_KEY"

# Remove one class
curl -X DELETE "https://your-cluster.example.com/api/v1/router/custom-classes?tenant_id=law-firm-acme&class_name=ContractReview" \
  -H "Authorization: Bearer $API_KEY"
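
The list and delete calls pass their arguments in the query string, so tenant or class names with spaces or reserved characters need encoding. A short sketch with the standard library; `custom_classes_url` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlencode


def custom_classes_url(
    base_url: str, tenant_id: str, class_name: Optional[str] = None
) -> str:
    """Build the custom-classes URL; class_name is only needed for DELETE."""
    params = {"tenant_id": tenant_id}
    if class_name is not None:
        params["class_name"] = class_name
    return f"{base_url.rstrip('/')}/api/v1/router/custom-classes?{urlencode(params)}"


print(custom_classes_url("https://your-cluster.example.com", "law-firm-acme"))
print(custom_classes_url("https://your-cluster.example.com", "law-firm-acme", "ContractReview"))
```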

All endpoints above gate on the rag capability (Standard tier and above per the license catalogue). Writes additionally require the admin role inside the requesting tenant; reads are scoped to the tenant boundary at the gateway. The customer-facing how-to pages for the same surface live at using RAG, chunking strategies, RAG on demand and custom classification.


Catalogue

The 25 sections.

Grouped by component. Each section in the markdown reference lists every endpoint with method, path, auth requirements, body schema and a short description. Authorisation is capability-gated (see § Authentication in the markdown source); a future 5.1 release adds tenant-named role composition.

1–3. Public surface

Edge-reachable endpoints — OpenAI-compatible chat completions, chat & conversation, identity & account. The first place to look if you're integrating from outside the LAN.

4. Inference & models

Worker (port 8890) and Cloud Worker (port 8889) — model registry, load / unload, multi-backend dispatch. Ollama, vLLM, TGI, Triton, OpenAI-compatible, llama.cpp.

5. Routing

Intent classification, theme detection, AI-routing decisions, load-balancing strategies, ensemble mode. Includes /api/v1/router/classify for inspecting how a query gets classified and /api/v1/router/custom-classes for managing a tenant's own intent taxonomy (Pro+).

6. Data — Storage, Vector, RAG

Multi-tenant file storage, vector storage, RAG ingest + search, per-namespace chunking config, embedding providers. Includes the intelligent-upload flow (/api/v1/upload/analyze, …/preview-chunks, …/commit) and the retention-loop accept/reject endpoint (/api/v1/chat/feedback).

7. Data — Matrix memory

Hierarchical associative memory — store, recall, list matrices, checkpoint, verify integrity, forget.

8. Data — NFS, databases, connectors

NFS-ganesha exports, remote-mount management, database connectivity (SQLite / PostgreSQL / MySQL / DB2), source connectors.

9. Agent worker

Agentic RAG (port 8893) — chat, sessions, multi-agent execution, query decomposition, workflow registration, training-data generation.

10. Media worker

STT, TTS, audio analysis, video processing, multimedia RAG, voice chat (audio in → STT → LLM → TTS → audio out).

11. Communication worker

Email, SMS, WhatsApp, Signal, Teams, XMPP, VoIP — accounts, messages, conversations, semantic search, AI auto-response queue, webhook handlers.

12. Science worker

Source Registry (§43) — 16 categories, 28 seeded sources, 11 LLM tools, per-category aliases, per-provider compat endpoints.

13. Training worker

Six backends (Unsloth, Axolotl, TRL, DeepSpeed, MLX, llama.cpp). Eight methods (LoRA, QLoRA, SFT, DPO, RLHF, PPO, full fine-tune, distillation). Training chains.

14. Native inference

Inferenced (port 8883) — direct GGUF / xLSTM model loading without external backend. Speculative decoding, continuous batching, pipeline parallelism.

15. IoT worker

Consumer IoT (Netatmo, HomeKit, Matter) plus industrial protocols (OPC-UA, Modbus, MQTT Sparkplug B). Policy bindings, safety interlocks, store-and-forward.

16. Swarm

Multi-agent orchestration with six topologies (hierarchical, P2P, ring, star, mesh, hybrid). Agent workers, MCP discovery, inter-agent messaging.

17. NOVA (experimental)

Goal system, tri-memory (episodic / semantic / procedural), reasoning, meta-learner, sandboxed self-modification. Off by default; admin opt-in.

18. System & cluster

Health, metrics, cluster topology, peer registration, worker / router lists, model details, capability discovery.

19. Cluster operations

Rolling upgrade (§70), 4.x → 5.0 migration (§85), backup & DR (§40). Drain → install → restart per node, manifest verification, restore points.

20. PKI & security

Internal CA management, Let's-Encrypt ACME issuance + renewal, certificate generation, deployment, rotation, cluster-wide push.

21. Observability

OpenTelemetry / OTLP export (§90), audit ledger, span / counter / histogram surface. Low-cardinality path normalisation.

22. Marketplace & plugins

Plugin catalogue (§80), install / uninstall / update flows, manifest validation, valve configuration. Five plugin types.

23. Theming & branding

Per-tenant theming (§99). Public-readable theme + branding for the chat shell; admin-gated writes; HTML-sanitised custom CSS.

24. License

Controller-side license activation + validation; admin-only license-server endpoints for issuing and downloading signed licenses.

25. xLSTM daemon

Structured-ML worker (port 8884) — policy execution (LRAM), forecasting (TiRex), encoding (ViL), associative retrieval (Hopfield).


Authorisation

Capability-gated endpoints.

Every endpoint declares a required capability in the kernel binary. On a 403 response, the JSON body carries a required_capability field telling your client what was missing. Your chat shell and plugin host should surface that field; the built-in chat shell does. Webhook subscribers should respect the same field on inbound responses.
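
A client-side sketch of surfacing that field. It relies only on what is stated above (a 403 whose JSON body carries required_capability); the "error" key in the sample body is invented for illustration, and `missing_capability` is a hypothetical helper name.

```python
import json
from typing import Optional


def missing_capability(status_code: int, body_text: str) -> Optional[str]:
    """Return the capability named in a 403 body, or None if absent."""
    if status_code != 403:
        return None
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        return None  # not JSON, nothing to surface
    if not isinstance(body, dict):
        return None
    capability = body.get("required_capability")
    return capability if isinstance(capability, str) else None


# Sample body is illustrative; only required_capability is documented here.
print(missing_capability(403, '{"error": "forbidden", "required_capability": "rag"}'))
```

Returning None rather than raising lets the caller fall back to a generic "forbidden" message when the field is missing.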

In 5.0, the existing four roles (Viewer / Developer / Admin / SuperAdmin) map onto the capability set under the hood. In 5.1, tenant admins compose customer-named roles from the underlying capabilities — your integration won't need to change, but admins gain the ability to name and grant their own roles. See release notes for the deferred 5.1 RBAC scope.


Where to go from here

Three related pages.


Next.

For the developer overview, including code examples: for developers. To install: get started. For the platform's overall data posture (relevant to tenant-scoped queries): your data. Specific questions: office@eldric.ai.