Twenty-five sections covering every endpoint shipped in Eldric 5.0 GA: the public OpenAI-compatible API at the Edge, the agent / data / media / comm / science / training worker surfaces, the cluster-operations endpoints, the xLSTM daemon, and the marketplace / observability / license surfaces. This page lists the customer-facing endpoint surface; internal cluster-to-cluster paths are not enumerated here.
curl https://your-cluster.example.com/v1/chat/completions \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "llama-3.3-70b",
"messages": [{"role":"user","content":"Hello."}],
"stream": true
}'
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://your-cluster.example.com/v1",
    # Read the key from the environment; Python does not expand a literal "$API_KEY".
    api_key=os.environ["API_KEY"],
)
resp = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Hello."}],
)
print(resp.choices[0].message.content)
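The curl example above sets "stream": true, while the Python example reads a complete response. A minimal sketch of consuming the streamed form follows, assuming the Edge emits the usual OpenAI-style server-sent events (lines of the form `data: {json}`, ending with `data: [DONE]`). The helper name `iter_stream_text` is hypothetical, and the sample lines are made up for illustration rather than captured from a cluster.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style SSE lines (assumed wire format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # conventional end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Illustrative sample only.
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo."}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

With the openai Python client, passing stream=True to client.chat.completions.create should return an iterator of chunks that carry the same delta structure, so hand-parsing like this is mainly useful for clients that talk to the endpoint over raw HTTP.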
curl -X POST https://your-cluster.example.com/api/v1/agent/chat \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"message": "Summarise the new contract attachments",
"knowledge_base_id": "kb-contracts-2026"
}'
These are the ten endpoints across the RAG + routing surface that customers consume directly. Cluster-internal paths (worker poll loops, peer aggregator internals, reindex orchestrator state) are not listed.
Customers drag a file in; the platform inspects it, suggests chunking and enrichment parameters, lets the user preview the chunks, then commits with the chosen params. Backed by four endpoints.
# 1. Analyse a file — get suggested params back
curl -X POST https://your-cluster.example.com/api/v1/upload/analyze \
-H "Authorization: Bearer $API_KEY" \
-F "file=@./paper.pdf"
# → returns {content_type, language, estimated_chunks, suggested_strategy,
# suggested_enrichment, extracted_keywords, ...}
# 2. Preview the chunks a given strategy would produce
curl -X POST https://your-cluster.example.com/api/v1/upload/preview-chunks \
-H "Authorization: Bearer $API_KEY" \
-F "file=@./paper.pdf" \
-F "strategy=semantic" -F "chunk_size=512" -F "overlap=50"
# → returns first 5–10 chunks for review
# 3. Commit the ingestion job with the chosen params (file_id comes from the /analyze response)
curl -X POST https://your-cluster.example.com/api/v1/upload/commit \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"file_id": "...",
"tenant_id": "default",
"namespace_id": "ai-papers",
"strategy": "semantic",
"chunk_size": 512,
"overlap": 50,
"enrichment": {"authors": true, "doi": true, "topic_tags": true}
}'
# → returns {job_id, estimated_duration_seconds}
# 4. Poll the job
curl "https://your-cluster.example.com/api/v1/upload/jobs/<job_id>" \
-H "Authorization: Bearer $API_KEY"
# → returns {status, chunks_done, total, embedding_done, ...}
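The four steps above can be sketched client-side as two small helpers: one that turns the /analyze response into a /commit body, and one that polls the job. This is a sketch under stated assumptions, not the platform's client. The function names are hypothetical, the terminal status values ("completed", "failed") are guesses because the job's status values are not listed here, and `suggested_enrichment` is assumed to have the same shape as the `enrichment` object in the commit body. The HTTP call is injected as a callable so the polling logic stays independent of any particular HTTP library.

```python
import time
from typing import Any, Callable, Dict

# Assumed terminal values of the job's "status" field (not confirmed by this page).
TERMINAL_STATES = {"completed", "failed"}


def build_commit_payload(analysis: Dict[str, Any], tenant_id: str,
                         namespace_id: str, chunk_size: int = 512,
                         overlap: int = 50) -> Dict[str, Any]:
    """Build a /api/v1/upload/commit body from an /upload/analyze response."""
    return {
        "file_id": analysis["file_id"],
        "tenant_id": tenant_id,
        "namespace_id": namespace_id,
        "strategy": analysis["suggested_strategy"],
        "chunk_size": chunk_size,
        "overlap": overlap,
        # Assumed to mirror the commit body's "enrichment" object.
        "enrichment": analysis.get("suggested_enrichment", {}),
    }


def wait_for_job(fetch_status: Callable[[], Dict[str, Any]],
                 interval_s: float = 2.0, max_polls: int = 150,
                 sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call fetch_status until an assumed terminal status appears or polls run out."""
    for _ in range(max_polls):
        job = fetch_status()
        if job.get("status") in TERMINAL_STATES:
            return job
        sleep(interval_s)
    raise TimeoutError(f"job not finished after {max_polls} polls")


# Usage with a stand-in for GET /api/v1/upload/jobs/<job_id>.
fake_responses = iter([
    {"status": "running", "chunks_done": 1, "total": 3},
    {"status": "completed", "chunks_done": 3, "total": 3},
])
final = wait_for_job(lambda: next(fake_responses), sleep=lambda s: None)
print(final["status"], final["chunks_done"], "of", final["total"])
```

In real use, `fetch_status` would wrap an authenticated GET against the jobs endpoint; capping the number of polls keeps a stuck job from blocking the caller indefinitely.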
Set the default chunking strategy at the knowledge-base level. Uploads into that namespace skip the suggestion dialog (or pre-fill it).
curl -X POST "https://your-cluster.example.com/api/v1/vector/namespaces/<tenant>/<ns>/config" \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"chunk_size": 512,
"chunk_overlap": 50,
"strategy": "semantic"
}'
A customer's thumbs-up on an assistant turn triggers the retention loop: cited sources auto-ingest into the knowledge base, the session feeds the dream cycle, and hot patterns become candidates for the next training corpus.
# verdict is "accept" or "reject"
curl -X POST https://your-cluster.example.com/api/v1/chat/feedback \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"session_id": "sess-abc123",
"message_id": "msg-456",
"verdict": "accept"
}'
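A client wiring thumbs-up / thumbs-down buttons to this endpoint might validate the verdict before sending. The sketch below builds the request body only; the helper name `feedback_body` is hypothetical, and the allowed set is limited to the two verdict values shown in the example above.

```python
import json

# The two verdict values shown for /api/v1/chat/feedback.
ALLOWED_VERDICTS = ("accept", "reject")


def feedback_body(session_id: str, message_id: str, verdict: str) -> str:
    """Serialise a feedback request body, rejecting unknown verdicts early."""
    if verdict not in ALLOWED_VERDICTS:
        raise ValueError(f"verdict must be one of {ALLOWED_VERDICTS}, got {verdict!r}")
    return json.dumps({
        "session_id": session_id,
        "message_id": message_id,
        "verdict": verdict,
    })


print(feedback_body("sess-abc123", "msg-456", "accept"))
```

Serialising with json.dumps also guarantees the body is strict JSON, which does not allow inline comments.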
Returns the assigned intent class with a confidence score and the source of the decision (base classifier, overlay, or LLM-fallback). Useful for debugging routing and surfacing classification in admin dashboards.
curl -X POST https://your-cluster.example.com/api/v1/router/classify \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{"query":"summarise the indemnification clause in contract X"}'
# → {
# "class": "ContractReview",
# "confidence": 0.91,
# "source": "overlay",
# "fallback_chain": ["base", "overlay"]
# }
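For the debugging and dashboard uses mentioned above, the response can be reduced to a one-line summary. This is an illustrative sketch: the function name and the `min_confidence` cut-off are hypothetical, and only the four fields shown in the sample response are assumed to exist.

```python
from typing import Any, Dict


def format_classification(result: Dict[str, Any], min_confidence: float = 0.5) -> str:
    """Render a /api/v1/router/classify response as a single log or dashboard line."""
    chain = " -> ".join(result.get("fallback_chain", []))
    # Flag results under a locally chosen cut-off; this is not the router's own threshold.
    flag = "" if result["confidence"] >= min_confidence else " [low confidence]"
    return (f'{result["class"]} ({result["confidence"]:.2f}) '
            f'via {result["source"]}; chain: {chain}{flag}')


example = {
    "class": "ContractReview",
    "confidence": 0.91,
    "source": "overlay",
    "fallback_chain": ["base", "overlay"],
}
print(format_classification(example))
```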
Pro+. Requires the admin role and is scoped to the tenant boundary. Register your own intent classes ("PatientTriage", "ContractReview", "AnomalyTrend") alongside the 128 built-ins. The classifier consults the overlay when the base classifier's confidence falls below the threshold.
# UPSERT a class
curl -X POST https://your-cluster.example.com/api/v1/router/custom-classes \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"tenant_id": "law-firm-acme",
"class_name": "ContractReview",
"description": "queries about clause review, redlines, indemnification",
"examples": ["summarise the indemnification clause in...", "what's the term length on...", ...]
}'
# List the tenant's taxonomy
curl "https://your-cluster.example.com/api/v1/router/custom-classes?tenant_id=law-firm-acme" \
-H "Authorization: Bearer $API_KEY"
# Remove one class
curl -X DELETE "https://your-cluster.example.com/api/v1/router/custom-classes?tenant_id=law-firm-acme&class_name=ContractReview" \
-H "Authorization: Bearer $API_KEY"
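The list and delete calls above identify the tenant and class through query parameters, so any value containing spaces or reserved characters needs URL-encoding. A minimal sketch using the standard library; the helper name `custom_classes_url` is hypothetical, and the base URL is the placeholder host used throughout this page.

```python
from typing import Optional
from urllib.parse import urlencode

BASE = "https://your-cluster.example.com"  # placeholder host from the examples


def custom_classes_url(tenant_id: str, class_name: Optional[str] = None) -> str:
    """Build the custom-classes URL for listing (no class_name) or deleting one class."""
    params = {"tenant_id": tenant_id}
    if class_name is not None:
        params["class_name"] = class_name
    return f"{BASE}/api/v1/router/custom-classes?{urlencode(params)}"


print(custom_classes_url("law-firm-acme"))
print(custom_classes_url("law-firm-acme", "ContractReview"))
```

The two printed URLs match the ones used in the curl calls; the encoding only changes the result for values that contain characters such as spaces or ampersands.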
All endpoints above gate on the rag capability (Standard tier and above per the license catalogue). Writes additionally require the admin role inside the requesting tenant; reads are scoped to the tenant boundary at the gateway. The customer-facing how-to pages for the same surface live at using RAG, chunking strategies, RAG on demand and custom classification.
Grouped by component. Each section in the markdown reference lists every endpoint with method, path, auth requirements, body schema and a short description. Authorisation is capability-gated (see § Authentication in the markdown source); a future 5.1 release adds tenant-named role composition.
Edge-reachable endpoints — OpenAI-compatible chat completions, chat & conversation, identity & account. The first place to look if you're integrating from outside the LAN.
Worker (port 8890) and Cloud Worker (port 8889) — model registry, load / unload, multi-backend dispatch. Ollama, vLLM, TGI, Triton, OpenAI-compatible, llama.cpp.
Intent classification, theme detection, AI-routing decisions, load-balancing strategies, ensemble mode. Includes /api/v1/router/classify for inspecting how a query gets classified and /api/v1/router/custom-classes for managing a tenant's own intent taxonomy (Pro+).
Multi-tenant file storage, vector storage, RAG ingest + search, per-namespace chunking config, embedding providers. Includes the intelligent-upload flow (/api/v1/upload/analyze, …/preview-chunks, …/commit) and the retention-loop accept/reject endpoint (/api/v1/chat/feedback).
Hierarchical associative memory — store, recall, list matrices, checkpoint, verify integrity, forget.
NFS-ganesha exports, remote-mount management, database connectivity (SQLite / PostgreSQL / MySQL / DB2), source connectors.
Agentic RAG (port 8893) — chat, sessions, multi-agent execution, query decomposition, workflow registration, training-data generation.
STT, TTS, audio analysis, video processing, multimedia RAG, voice chat (audio in → STT → LLM → TTS → audio out).
Email, SMS, WhatsApp, Signal, Teams, XMPP, VoIP — accounts, messages, conversations, semantic search, AI auto-response queue, webhook handlers.
Source Registry (§43) — 16 categories, 28 seeded sources, 11 LLM tools, per-category aliases, per-provider compat endpoints.
Six backends (Unsloth, Axolotl, TRL, DeepSpeed, MLX, llama.cpp). Eight methods (LoRA, QLoRA, SFT, DPO, RLHF, PPO, full fine-tune, distillation). Training chains.
Inferenced (port 8883) — direct GGUF / xLSTM model loading without external backend. Speculative decoding, continuous batching, pipeline parallelism.
Consumer IoT (Netatmo, HomeKit, Matter) plus industrial protocols (OPC-UA, Modbus, MQTT Sparkplug B). Policy bindings, safety interlocks, store-and-forward.
Multi-agent orchestration with six topologies (hierarchical, P2P, ring, star, mesh, hybrid). Agent workers, MCP discovery, inter-agent messaging.
Goal system, tri-memory (episodic / semantic / procedural), reasoning, meta-learner, sandboxed self-modification. Off by default; admin opt-in.
Health, metrics, cluster topology, peer registration, worker / router lists, model details, capability discovery.
Rolling upgrade (§70), 4.x → 5.0 migration (§85), backup & DR (§40). Drain → install → restart per node, manifest verification, restore points.
Internal CA management, Let's-Encrypt ACME issuance + renewal, certificate generation, deployment, rotation, cluster-wide push.
OpenTelemetry / OTLP export (§90), audit ledger, span / counter / histogram surface. Low-cardinality path normalisation.
Plugin catalogue (§80), install / uninstall / update flows, manifest validation, valve configuration. Five plugin types.
Per-tenant theming (§99). Public-readable theme + branding for the chat shell; admin-gated writes; HTML-sanitised custom CSS.
Controller-side license activation + validation; admin-only license-server endpoints for issuing and downloading signed licenses.
Structured-ML worker (port 8884) — policy execution (LRAM), forecasting (TiRex), encoding (ViL), associative retrieval (Hopfield).
Every endpoint declares a required capability in the kernel binary. On a 403 response, the JSON body carries a required_capability field telling your client what was missing. Your chat shell and plugin host should surface that field; the built-in chat shell does. Webhook subscribers should respect the same field on inbound responses.
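A client can surface the missing capability with a small helper like the one below. It is a sketch, assuming only what is stated above: a 403 response whose JSON body carries a `required_capability` field. The function name is hypothetical, and the sample body (including its `error` field) is invented for illustration.

```python
import json
from typing import Optional


def missing_capability(status: int, body: str) -> Optional[str]:
    """Return required_capability from a 403 JSON body, or None if it is absent."""
    if status != 403:
        return None
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return None  # non-JSON 403, for example from an intermediate proxy
    if not isinstance(data, dict):
        return None
    value = data.get("required_capability")
    return value if isinstance(value, str) else None


# Invented sample body; "rag" is the capability named for the RAG endpoints.
sample_body = '{"error": "forbidden", "required_capability": "rag"}'
capability = missing_capability(403, sample_body)
if capability is not None:
    print(f"Your account lacks the '{capability}' capability.")
```

Returning None for anything unexpected lets the caller fall back to a generic "forbidden" message instead of failing while handling the error.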
In 5.0, the existing four roles (Viewer / Developer / Admin / SuperAdmin) map onto the capability set under the hood. In 5.1, tenant admins compose customer-named roles from the underlying capabilities — your integration won't need to change, but admins gain the ability to name and grant their own roles. See release notes for the deferred 5.1 RBAC scope.
For the developer overview, including code examples: for developers. To install: get started. For the platform's overall data posture (relevant to tenant-scoped queries): your data. Specific questions: office@eldric.ai.