API reference — Eldric 5.0 GA

Recent additions — 5.0 GA

The customer-facing endpoints shipped in the run-up to GA.

These are the ten endpoints across the RAG and routing surface that customers consume directly. Cluster-internal paths (worker poll loops, peer aggregator internals, reindex orchestrator state) are not listed.

Intelligent upload — analyse → preview → commit → poll

Customers drag a file in; the platform inspects it, suggests chunking and enrichment parameters, lets the user preview the chunks, then commits with the chosen params. Backed by four endpoints.

# 1. Analyse a file — get suggested params back
curl -X POST https://your-cluster.example.com/api/v1/upload/analyze \
  -H "Authorization: Bearer $API_KEY" \
  -F "file=@./paper.pdf"
# → returns {content_type, language, estimated_chunks, suggested_strategy,
#            suggested_enrichment, extracted_keywords, ...}

# 2. Preview the chunks a given strategy would produce
curl -X POST https://your-cluster.example.com/api/v1/upload/preview-chunks \
  -H "Authorization: Bearer $API_KEY" \
  -F "file=@./paper.pdf" \
  -F "strategy=semantic" -F "chunk_size=512" -F "overlap=50"
# → returns first 5–10 chunks for review

# 3. Commit the ingestion job with the chosen params
#    (file_id is the value returned by /analyze)
curl -X POST https://your-cluster.example.com/api/v1/upload/commit \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "file_id": "...",
    "tenant_id": "default",
    "namespace_id": "ai-papers",
    "strategy": "semantic",
    "chunk_size": 512,
    "overlap": 50,
    "enrichment": {"authors": true, "doi": true, "topic_tags": true}
  }'
# → returns {job_id, estimated_duration_seconds}

# 4. Poll the job
curl "https://your-cluster.example.com/api/v1/upload/jobs/<job_id>" \
  -H "Authorization: Bearer $API_KEY"
# → returns {status, chunks_done, total, embedding_done, ...}
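Step 4 is usually wrapped in a loop. The following is a minimal Python sketch of such a loop, not the platform's own client. The terminal status values ("completed", "failed"), the helper names and the default interval and timeout are assumptions; the reference above only states that the job document carries a status field.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict

# Assumption: the reference only names a "status" field; these terminal
# values are placeholders to adjust to what your cluster actually returns.
TERMINAL_STATUSES = {"completed", "failed"}


def make_job_fetcher(base_url: str, job_id: str, api_key: str) -> Callable[[], Dict[str, Any]]:
    """Return a zero-argument callable that GETs the job document as a dict."""
    request = urllib.request.Request(
        f"{base_url}/api/v1/upload/jobs/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )

    def fetch() -> Dict[str, Any]:
        with urllib.request.urlopen(request, timeout=30) as response:
            return json.load(response)

    return fetch


def poll_job(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status() until the job is terminal or the time budget is spent."""
    waited = 0.0
    while True:
        job = fetch_status()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if waited >= timeout_s:
            raise TimeoutError(f"job still {job.get('status')!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

A caller would pass `make_job_fetcher(base_url, job_id, api_key)` as `fetch_status`. Keeping the fetcher and the sleep function injectable lets the loop be exercised without a live cluster.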

Per-namespace chunking config

Set the default chunking strategy at the knowledge-base level. Uploads into that namespace skip the suggestion dialog (or pre-fill it).

curl -X POST "https://your-cluster.example.com/api/v1/vector/namespaces/<tenant>/<ns>/config" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "chunk_size": 512,
    "chunk_overlap": 50,
    "strategy": "semantic"
  }'
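The same call can be prepared from Python. This sketch only builds the request object so it can be inspected before sending. The function name is hypothetical, and the overlap-versus-size check is a client-side sanity check assumed here, not a documented server rule.

```python
import json
import urllib.request
from urllib.parse import quote


def build_namespace_config_request(
    base_url: str,
    api_key: str,
    tenant: str,
    namespace: str,
    chunk_size: int,
    chunk_overlap: int,
    strategy: str,
) -> urllib.request.Request:
    """Build (but do not send) the POST that sets a namespace's chunking defaults."""
    # Assumed client-side check: an overlap as large as the chunk itself
    # would leave no new text in each successive chunk.
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    # Percent-encode path segments so unusual tenant or namespace names stay intact.
    url = (
        f"{base_url}/api/v1/vector/namespaces/"
        f"{quote(tenant, safe='')}/{quote(namespace, safe='')}/config"
    )
    body = json.dumps(
        {"chunk_size": chunk_size, "chunk_overlap": chunk_overlap, "strategy": strategy}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; the body keys are the ones shown in the curl example above.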

Retention loop — accept / reject signal

A customer's thumbs-up on an assistant turn triggers the retention loop: cited sources auto-ingest into the knowledge base, the session feeds the dream cycle, and hot patterns become candidates for the next training corpus.

# "verdict" is either "accept" or "reject"
curl -X POST https://your-cluster.example.com/api/v1/chat/feedback \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "sess-abc123",
    "message_id": "msg-456",
    "verdict": "accept"
  }'
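A chat UI typically maps a thumbs-up or thumbs-down widget onto this body. The helper below is an illustrative sketch; the function name and the boolean argument are assumptions, while the three field names and the two verdict values come from the example above.

```python
import json

# The two verdict values shown in the reference above.
VERDICTS = ("accept", "reject")


def feedback_body(session_id: str, message_id: str, thumbs_up: bool) -> str:
    """Serialise the JSON body for POST /api/v1/chat/feedback."""
    verdict = VERDICTS[0] if thumbs_up else VERDICTS[1]
    return json.dumps(
        {"session_id": session_id, "message_id": message_id, "verdict": verdict}
    )
```

Because the body goes through `json.dumps`, it is valid JSON with no inline comments, which a strict JSON parser would reject.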

Router — classify a query

Returns the assigned intent class with a confidence score and the source of the decision (base classifier, overlay, or LLM-fallback). Useful for debugging routing and surfacing classification in admin dashboards.

curl -X POST https://your-cluster.example.com/api/v1/router/classify \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query":"summarise the indemnification clause in contract X"}'

# → {
#     "class": "ContractReview",
#     "confidence": 0.91,
#     "source": "overlay",
#     "fallback_chain": ["base", "overlay"]
#   }
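For an admin dashboard, the response can be parsed into a small typed record. This is a sketch under assumptions: the `Classification` type, the `intent` field name (chosen because `class` is a reserved word in Python) and the summary format are illustrative, and only the four response keys come from the sample above.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Classification:
    intent: str            # the "class" key of the response
    confidence: float
    source: str
    fallback_chain: List[str]


def parse_classification(raw: str) -> Classification:
    """Parse the JSON text returned by /api/v1/router/classify."""
    data = json.loads(raw)
    return Classification(
        intent=data["class"],
        confidence=float(data["confidence"]),
        source=data["source"],
        fallback_chain=list(data.get("fallback_chain", [])),
    )


def summary(c: Classification) -> str:
    """One-line rendering for a log line or dashboard cell."""
    return f"{c.intent} ({c.confidence:.0%} via {c.source})"
```

With the sample response shown above, `summary` renders the result as `ContractReview (91% via overlay)`.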

Router — custom intent classes (per-tenant)

Pro+. Requires the admin role and is scoped to the tenant boundary. Register your own intent classes ("PatientTriage", "ContractReview", "AnomalyTrend") alongside the 128 built-ins. The classifier consults the overlay when the base classifier's confidence falls below the threshold.
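The base-then-overlay behaviour can be pictured as a chain of classifiers tried in order. The sketch below is purely illustrative and is not Eldric's implementation: the stage functions, the 0.8 threshold and the shape of the stage list are all assumptions, and only the keys of the returned dict mirror the /api/v1/router/classify response.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# A stage is (name, function); the function returns (label or None, confidence).
Stage = Tuple[str, Callable[[str], Tuple[Optional[str], float]]]


def classify_with_fallback(
    query: str, stages: List[Stage], threshold: float = 0.8
) -> Dict[str, Any]:
    """Try each stage in order; stop at the first one that clears the threshold."""
    chain: List[str] = []
    for name, classify in stages:
        chain.append(name)
        label, confidence = classify(query)
        if label is not None and confidence >= threshold:
            return {
                "class": label,
                "confidence": confidence,
                "source": name,
                "fallback_chain": chain,
            }
    # No stage was confident enough.
    return {"class": None, "confidence": 0.0, "source": None, "fallback_chain": chain}
```

With a low-confidence base stage followed by a confident overlay stage, the result names the overlay as the source and lists both stages in the chain.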

# UPSERT a class (add further sample queries to "examples" as needed)
curl -X POST https://your-cluster.example.com/api/v1/router/custom-classes \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "law-firm-acme",
    "class_name": "ContractReview",
    "description": "queries about clause review, redlines, indemnification",
    "examples": ["summarise the indemnification clause in...", "what is the term length on..."]
  }'

# List the tenant's taxonomy
curl "https://your-cluster.example.com/api/v1/router/custom-classes?tenant_id=law-firm-acme" \
  -H "Authorization: Bearer $API_KEY"

# Remove one class
curl -X DELETE "https://your-cluster.example.com/api/v1/router/custom-classes?tenant_id=law-firm-acme&class_name=ContractReview" \
  -H "Authorization: Bearer $API_KEY"
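The list and delete calls both identify their target through query parameters, so tenant or class names containing spaces or reserved characters need URL encoding. A minimal sketch, assuming a hypothetical helper name; the path and the two parameter names are taken from the curl examples above.

```python
from typing import Optional
from urllib.parse import urlencode


def custom_classes_url(
    base_url: str, tenant_id: str, class_name: Optional[str] = None
) -> str:
    """URL for listing a tenant's classes, or for addressing one class by name."""
    params = {"tenant_id": tenant_id}
    if class_name is not None:
        params["class_name"] = class_name
    # urlencode percent-encodes reserved characters and turns spaces into "+".
    return f"{base_url}/api/v1/router/custom-classes?{urlencode(params)}"
```

Use the one-parameter form with GET to list the taxonomy and the two-parameter form with DELETE to remove a single class.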

All endpoints above gate on the rag capability (Standard tier and above per the license catalogue). Writes additionally require the admin role inside the requesting tenant; reads are scoped to the tenant boundary at the gateway. The customer-facing how-to pages for the same surface live at using RAG, chunking strategies, RAG on demand and custom classification.

Catalogue

The 25 sections.

Grouped by component. Each section in the markdown reference lists every endpoint with method, path, auth requirements, body schema and a short description. Authorisation is capability-gated (see § Authentication in the markdown source); a future 5.1 release adds tenant-named role composition.

1–3. Public surface

Edge-reachable endpoints — OpenAI-compatible chat completions, chat & conversation, identity & account. The first place to look if you're integrating from outside the LAN.

4. Inference & models

Worker (port 8890) and Cloud Worker (port 8889) — model registry, load / unload, multi-backend dispatch. Ollama, vLLM, TGI, Triton, OpenAI-compatible, llama.cpp.

5. Routing

Intent classification, theme detection, AI-routing decisions, load-balancing strategies, ensemble mode. Includes /api/v1/router/classify for inspecting how a query gets classified and /api/v1/router/custom-classes for managing a tenant's own intent taxonomy (Pro+).

6. Data — Storage, Vector, RAG

Multi-tenant file storage, vector storage, RAG ingest + search, per-namespace chunking config, embedding providers. Includes the intelligent-upload flow (/api/v1/upload/analyze, …/preview-chunks, …/commit) and the retention-loop accept/reject endpoint (/api/v1/chat/feedback).

7. Data — Matrix memory

Hierarchical associative memory — store, recall, list matrices, checkpoint, verify integrity, forget.

8. Data — NFS, databases, connectors

NFS-ganesha exports, remote-mount management, database connectivity (SQLite / PostgreSQL / MySQL / DB2), source connectors.

9. Agent worker

Agentic RAG (port 8893) — chat, sessions, multi-agent execution, query decomposition, workflow registration, training-data generation.

10. Media worker

STT, TTS, audio analysis, video processing, multimedia RAG, voice chat (audio in → STT → LLM → TTS → audio out).

11. Communication worker

Email, SMS, WhatsApp, Signal, Teams, XMPP, VoIP — accounts, messages, conversations, semantic search, AI auto-response queue, webhook handlers.

12. Science worker

Source Registry (§43) — 16 categories, 28 seeded sources, 11 LLM tools, per-category aliases, per-provider compat endpoints.

13. Training worker

Six backends (Unsloth, Axolotl, TRL, DeepSpeed, MLX, llama.cpp). Eight methods (LoRA, QLoRA, SFT, DPO, RLHF, PPO, full fine-tune, distillation). Training chains.

14. Native inference

Inferenced (port 8883) — direct GGUF / xLSTM model loading without external backend. Speculative decoding, continuous batching, pipeline parallelism.

15. IoT worker

Consumer IoT (Netatmo, HomeKit, Matter) plus industrial protocols (OPC-UA, Modbus, MQTT Sparkplug B). Policy bindings, safety interlocks, store-and-forward.

16. Swarm

Multi-agent orchestration with six topologies (hierarchical, P2P, ring, star, mesh, hybrid). Agent workers, MCP discovery, inter-agent messaging.

17. NOVA (experimental)

Goal system, tri-memory (episodic / semantic / procedural), reasoning, meta-learner, sandboxed self-modification. Off by default; admin opt-in.

18. System & cluster

Health, metrics, cluster topology, peer registration, worker / router lists, model details, capability discovery.

19. Cluster operations

Rolling upgrade (§70), 4.x → 5.0 migration (§85), backup & DR (§40). Drain → install → restart per node, manifest verification, restore points.

20. PKI & security

Internal CA management, Let's-Encrypt ACME issuance + renewal, certificate generation, deployment, rotation, cluster-wide push.

21. Observability

OpenTelemetry / OTLP export (§90), audit ledger, span / counter / histogram surface. Low-cardinality path normalisation.

22. Marketplace & plugins

Plugin catalogue (§80), install / uninstall / update flows, manifest validation, valve configuration. Five plugin types.

23. Theming & branding

Per-tenant theming (§99). Public-readable theme + branding for the chat shell; admin-gated writes; HTML-sanitised custom CSS.

24. License

Controller-side license activation + validation; admin-only license-server endpoints for issuing and downloading signed licenses.

25. xLSTM daemon

Structured-ML worker (port 8884) — policy execution (LRAM), forecasting (TiRex), encoding (ViL), associative retrieval (Hopfield).
