This page is for the person whose job is keeping an Eldric installation healthy day-to-day — install, users, tenants, monitoring, backup, upgrade. It walks the standard ops surface in order; for the deep references (every endpoint, every flag) follow the links into the API reference and the per-feature pages.
An Eldric cluster has one controller, one or more routers, one or more inference workers, one or more data workers, and an optional ring of specialised workers (agent / media / comm / science / training / xLSTM / IoT). The edge gateway is the public entry point. All daemons run as systemd units on each host.
For a small evaluation cluster, a single host runs everything via the eldric-aios meta-package. For production, you split across hosts — typically the GPU-equipped boxes run inference + xLSTM, the storage-heavy boxes run data workers, and a small management host runs the controller + edge. The install guide covers the single-node path; the multi-node path is below.
One host, one meta-package:
curl -fsSL https://repo.eldric.ai/install.sh | sudo bash
sudo dnf install eldric-aios
sudo systemctl enable --now eldric-aios
Within 30 seconds, the chat shell is at https://<host>/chat. First signup becomes admin. See first run for the post-install setup.
The install command is the same on every host; what differs is the set of role packages each host installs:
# Management host
sudo dnf install eldric-aios-controller eldric-aios-edge
# Inference hosts (GPU-equipped)
sudo dnf install eldric-aios-worker eldric-aios-inferenced
sudo systemctl set-environment ELDRIC_CONTROLLER=https://mgmt-host:8880
# Data hosts
sudo dnf install eldric-aios-data
# Optional specialised
sudo dnf install eldric-aios-{agent,media,comm,science,training,xlstmd,iiotd}
Each daemon registers itself with the controller on first start. Running systemctl status 'eldric-*' on each host confirms that the units are up (quote the pattern so the shell hands it to systemctl unexpanded). The cluster topology page in the chat shell (Admin Console → Cluster) shows the registered workers in real time.
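For a scripted version of the per-host check, a minimal sketch along these lines may help. It relies only on systemctl is-active; the unit names in the usage note are assumptions based on the package names above, so substitute whatever units are actually installed on the host.

```shell
# inactive_units UNIT...: print every listed systemd unit that is not active.
# Nothing here is Eldric-specific; the caller supplies the unit names.
inactive_units() {
  local unit
  for unit in "$@"; do
    # "is-active --quiet" exits 0 only when the unit is in the active state
    if ! systemctl is-active --quiet "$unit"; then
      printf '%s\n' "$unit"
    fi
  done
}
```

On a management host this might be called as inactive_units eldric-aios-controller eldric-aios-edge; empty output means every listed unit is active.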
Use Admin Console → Users to add, suspend or remove users. Roles are Viewer / Developer / Admin / SuperAdmin (the last is for cross-tenant operations only). Use Admin Console → Tenants to add new tenants, one per department / study / project / customer. Per-tenant scope is enforced at the gateway; cross-tenant access is denied unconditionally.
Walkthrough — onboarding a new department: (1) create the tenant (Tenants → New) with a short slug; (2) assign a per-tenant storage quota; (3) add the department head as Admin of that tenant; (4) the Admin invites their users via the Admin Console of their own tenant. The platform-level SuperAdmin steps out at this point — day-to-day administration lives inside the tenant.
Admin Console → KBs to provision per-tenant knowledge bases. Each KB has its own embedding model + vector storage + (optional) matrix-memory layer. The compressed-memory preview lives here — see advanced retrieval for the opt-in path.
Walkthrough — adding documents: (1) KBs → New KB → pick embedding model and the optional matrix-memory tier; (2) KB → Upload → drop PDF / DOCX / Markdown / HTML / plain text (or pull from a Data Worker mount); (3) the embedding pipeline runs in the background — track in KBs → Status; (4) chat against the KB by selecting it in the chat shell's source picker, or query directly via the API. Re-embedding after a model change rebuilds the entire KB in place; no manual rollover needed.
Admin Console → Models to manage which models are visible per tenant — show all, restrict to a curated list, hide external APIs entirely. The backend badges (Ollama, OpenAI, Inferenced, vLLM, llama.cpp, and so on — see model providers) are auto-derived from each model's source. Pinning a model as the per-tenant default makes it the entry point for new conversations.
Use Admin Console → License to upload your signed license file. The controller verifies the Ed25519 signature on the file and lifts limits accordingly. License updates in the middle of a term are applied hot, with no restart. License email: license@core.at.
Use journalctl -u eldric-aios for the unified meta-unit, and journalctl -u eldric-aios-controller (and so on) for an individual daemon. The audit ledger at /var/lib/eldric/audit/ hash-chains every admin-side action and AI-assisted decision, which gives you a defensible record for compliance reviews. The ledger is append-only and tamper-evident: an admin cannot alter a prior entry without breaking the chain, even with direct file access. Admin Console → Audit exports a slice of the ledger as signed JSON for compliance handoff.
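To see why hash-chaining makes a ledger tamper-evident, the following sketch chains a plain-text log with SHA-256. It is purely illustrative: the on-disk format of the Eldric ledger is not described here, and the one-entry-per-line layout, the all-zero seed and both function names are assumptions.

```shell
# Illustrative hash chain over a plain-text log, one entry per line.
# This is NOT the Eldric ledger format; it only demonstrates the principle.

# chain_log: read entries on stdin, print "<hash> <entry>" lines where
# hash = SHA-256(previous hash + entry). The first entry chains from 64 zeros.
chain_log() {
  local prev entry
  prev="$(printf '%064d' 0)"
  while IFS= read -r entry; do
    prev="$(printf '%s%s' "$prev" "$entry" | sha256sum | cut -d' ' -f1)"
    printf '%s %s\n' "$prev" "$entry"
  done
}

# verify_chain: read "<hash> <entry>" lines on stdin, recompute each hash,
# and return non-zero at the first line whose recorded hash does not match.
verify_chain() {
  local prev line recorded entry expected
  prev="$(printf '%064d' 0)"
  while IFS= read -r line; do
    recorded="${line%% *}"   # text before the first space
    entry="${line#* }"       # text after the first space
    expected="$(printf '%s%s' "$prev" "$entry" | sha256sum | cut -d' ' -f1)"
    [ "$recorded" = "$expected" ] || return 1
    prev="$recorded"
  done
  return 0
}
```

Because each hash covers the previous one, editing any earlier entry invalidates that line's recorded hash, so verify_chain fails at the edited line.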
Each daemon serves /health at its primary port. A simple liveness probe from your monitoring stack hits these.
/metrics serves Prometheus-format metrics: standard counters (request rate, error rate, latency percentiles) plus per-tenant / per-model breakdowns.
Recommended alerts to wire into your existing stack:
Latency on /v1/chat/completions above your service objective, typically 2× the median over a rolling window. Fires when an inference worker, a backend model or a cloud provider has degraded.
The Admin Console → Telemetry page suggests sensible defaults for each. Tune to your traffic shape; alerts that never fire are noise to your on-call.
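The "2× the median over a rolling window" rule can be prototyped outside your alerting stack before you commit to a threshold. The sketch below assumes you already have the window's latency samples as one number per line (how you export them is up to your monitoring setup); both function names are hypothetical.

```shell
# median: read one numeric latency sample per line, print the median.
median() {
  sort -n | awk '{ v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      if (NR % 2) print v[(NR + 1) / 2]
      else print (v[NR / 2] + v[NR / 2 + 1]) / 2
    }'
}

# exceeds_factor VALUE BASELINE [FACTOR]: succeed when VALUE > FACTOR * BASELINE.
# FACTOR defaults to 2, matching the rule of thumb in the text.
exceeds_factor() {
  awk -v v="$1" -v b="$2" -v f="${3:-2}" 'BEGIN { exit !(v > f * b) }'
}
```

A check might then read: baseline="$(median < window.txt)" followed by exceeds_factor "$current" "$baseline" && echo "latency alert", where window.txt and current stand in for your own data.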
Two backup paths cover the cluster state:
For offsite / disaster-recovery copies, mount your offsite storage on the data worker and point the snapshot system at it — the 5.0 path. 5.1 adds the offsite-destination automation.
The controller runs a rolling-update orchestrator that walks every node in turn: drain → install → restart → verify, then move on. From the Admin Console → Updates, pick the target version and start; the orchestrator handles the sequence and reports per-node status.
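The drain → install → restart → verify sequence can be pictured as a nested loop that halts at the first failure. The sketch below is not the controller's orchestrator; it only illustrates the ordering, and the step command is a caller-supplied placeholder.

```shell
# rolling_update STEP_CMD NODE...: run drain, install, restart, verify on each
# node in order, stopping at the first failing step so later nodes stay untouched.
# STEP_CMD is a hypothetical caller-supplied command invoked as: STEP_CMD <step> <node>.
rolling_update() {
  local step_cmd="$1" node step
  shift
  for node in "$@"; do
    for step in drain install restart verify; do
      if ! "$step_cmd" "$step" "$node"; then
        printf 'halted: %s failed on %s\n' "$step" "$node" >&2
        return 1
      fi
    done
    printf 'ok: %s\n' "$node"
  done
}
```

Stopping at the first failed step is the point of the design: a bad build takes down at most one node, and the remaining nodes keep serving on the previous version.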
For single-node installs the standard sudo dnf update eldric-aios works directly. For air-gapped clusters, mirror repo.eldric.ai/5.0/ locally and point dnf at the mirror.
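For the air-gapped case, pointing dnf at a mirror usually comes down to a .repo file under /etc/yum.repos.d/. The sketch below prints such a stanza; the repo id, the display name and the mirror URL in the usage note are assumptions, and gpgcheck=1 presumes the vendor signing key is already imported on the host.

```shell
# mirror_repo_stanza BASEURL: print a dnf .repo stanza for a local mirror.
# The repo id "eldric-mirror" is a made-up name for illustration.
mirror_repo_stanza() {
  printf '[eldric-mirror]\n'
  printf 'name=Eldric local mirror\n'
  printf 'baseurl=%s\n' "$1"
  printf 'enabled=1\n'
  # gpgcheck=1 assumes the package signing key has been imported beforehand
  printf 'gpgcheck=1\n'
}
```

Usage might look like mirror_repo_stanza http://mirror.internal/eldric/5.0/ | sudo tee /etc/yum.repos.d/eldric-mirror.repo, with mirror.internal standing in for your own mirror host.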
Rollback automation arrives with 5.1 (§70). On 5.0, rollback is manual: pin the previous version and re-run the orchestrator.
The platform ships a stress-test harness — parallel-user × request-count load against a cluster host with pass/fail thresholds for p99 latency and error budget. Run it before the soak window when you're commissioning a cluster, and re-run after a meaningful capacity change. The results compare against the published demo-cluster baseline.
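If you want to sanity-check harness results by hand, the pass/fail logic for p99 latency and error budget can be approximated as below. This is not the shipped harness; the function names, the nearest-rank percentile method and the argument order are all assumptions for illustration.

```shell
# p99: read one latency sample per line, print the 99th percentile (nearest-rank).
p99() {
  sort -n | awk '{ v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((NR * 99 + 99) / 100)   # ceil(NR * 0.99)
      print v[rank]
    }'
}

# stress_verdict P99 MAX_P99 ERRORS TOTAL MAX_ERROR_RATE: print PASS or FAIL
# and exit 0 only on PASS. Both thresholds must hold for a pass.
stress_verdict() {
  awk -v p="$1" -v pmax="$2" -v e="$3" -v n="$4" -v emax="$5" 'BEGIN {
    ok = (n > 0) && (p <= pmax) && (e / n <= emax)
    print (ok ? "PASS" : "FAIL")
    exit !ok
  }'
}
```

For example, stress_verdict "$(p99 < latencies.txt)" 1000 2 1000 0.01 passes when p99 is at most 1000 ms and no more than 1% of 1000 requests failed; latencies.txt and the thresholds are placeholders.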
journalctl -u eldric-aios --since today --no-pager is the first place to look for any daemon issue.
For the developer-side view: for developers + API reference. For the deeper-on-the-platform view: how it works. For the GA prep: road to 5.0 GA. Questions: office@eldric.ai.