This page is for the person whose job is keeping an Eldric installation healthy day-to-day — install, users, tenants, monitoring, backup, upgrade. It walks the standard ops surface in order; for the deep references (every endpoint, every flag) follow the links into the API reference and the per-feature pages.
An Eldric cluster has one controller, one or more routers, one or more inference workers, one or more data workers, and an optional ring of specialised workers (agent / media / comm / science / training / xLSTM / IoT). The edge gateway is the public entry point. All daemons run as systemd units on each host.
For a small evaluation cluster, a single host runs everything via the eldric-aios meta-package. For production, you split across hosts — typically the GPU-equipped boxes run inference + xLSTM, the storage-heavy boxes run data workers, and a small management host runs the controller + edge. The install guide covers the single-node path; the multi-node path is below.
One host, one meta-package. The recommended single-command path:
curl -fsSL https://repo.eldric.ai/install-eldric.sh | sudo bash
That bootstrap chains the repository setup, dnf install eldric-aios, systemctl enable --now eldric-aios, and the post-install eldric setup health probe + summary. Pass -s -- --license-file PATH --admin-email EMAIL to activate a license file in the same shot.
Prefer the steps separately:
curl -fsSL https://repo.eldric.ai/install.sh | sudo bash sudo dnf install eldric-aios sudo systemctl enable --now eldric-aios sudo eldric setup
Within 30 seconds the chat shell is at https://<host>/chat. (On a single node without a dedicated edge gateway, Eldric serves it on the local admin port until you add an edge.) First signup becomes admin. See first run for what to do next.
Eldric 5.0 ships as one meta-package per host (eldric-aios); GPU hosts add the CUDA RPM (eldric-aios-cuda). The same install command runs on every node — what differs is the role assignment, set via environment file at /etc/eldric/eldric-aios.env.
# Every host — same one-liner
curl -fsSL https://repo.eldric.ai/install-eldric.sh | sudo bash
# GPU hosts add the CUDA package
sudo dnf install eldric-aios-cuda
# On each non-controller host, point at the controller and pin the role
echo "ELDRIC_AIOS_CONTROLLER_URL=https://mgmt-host:8880" | \
sudo tee -a /etc/eldric/eldric-aios.env
echo "ELDRIC_AIOS_ROLE=worker" | sudo tee -a /etc/eldric/eldric-aios.env # or controller / data / edge / agent / media / comm / science / training / iot
sudo systemctl restart eldric-aios
Each daemon registers itself with the controller on first start. systemctl status eldric-aios on each host confirms the lifecycle; the unified meta-unit hosts every module on that node. The cluster topology page in the chat shell (Admin Console → Cluster) shows the registered nodes in real time. Role-pinning matters — leaving a bare node on the default role=all can trigger health watchdog crash-loops on hardware that doesn't fit the full stack.
Admin Console → Users to add, suspend or remove users. Roles are Viewer / Developer / Admin / SuperAdmin (the latter for cross-tenant operations only). Admin Console → Tenants to add new tenants — one per department / study / project / customer. Per-tenant scope is enforced at the gateway; cross-tenant access is denied unconditionally.
Walkthrough — onboarding a new department: (1) create the tenant (Tenants → New) with a short slug; (2) assign a per-tenant storage quota; (3) add the department head as Admin of that tenant; (4) the Admin invites their users via the Admin Console of their own tenant. The platform-level SuperAdmin steps out at this point — day-to-day administration lives inside the tenant.
Admin Console → KBs to provision per-tenant knowledge bases. Each KB has its own embedding model + vector storage + (optional) matrix-memory layer. The compressed-memory preview lives here — see advanced retrieval for the opt-in path.
Walkthrough — adding documents: (1) KBs → New KB → pick embedding model and the optional matrix-memory tier; (2) KB → Upload → drop PDF / DOCX / Markdown / HTML / plain text (or pull from a Data Worker mount); (3) the embedding pipeline runs in the background — track in KBs → Status; (4) chat against the KB by selecting it in the chat shell's source picker, or query directly via the API. Re-embedding after a model change rebuilds the entire KB in place; no manual rollover needed.
Admin Console → Models to manage which models are visible per tenant — show all, restrict to a curated list, hide external APIs entirely. The backend badges (Ollama, OpenAI, Inferenced, vLLM, llama.cpp, and so on — see model providers) are auto-derived from each model's source. Pinning a model as the per-tenant default makes it the entry point for new conversations.
Admin Console → License to drop in your signed license file. The controller verifies the Ed25519 signature on the file and lifts limits accordingly. Mid-licence updates are hot — no restart. License email: license@core.at.
journalctl -u eldric-aios tails the unified meta-unit (the 5.0 daemon hosts every module that node's role enabled). The audit ledger at /var/lib/eldric/audit/ hash-chains every admin-side action and AI-assisted decision — defensible record for compliance reviews. The ledger is append-only and tamper-evident; an admin reading the ledger cannot edit prior entries, even via direct file access. Admin Console → Audit exports a slice of the ledger as signed JSON for compliance handoff.
/health at its primary port. A simple liveness probe from your monitoring stack hits these./metrics in Prometheus format. Standard counters (request rate, error rate, latency percentiles) plus per-tenant / per-model breakdowns.Recommended alerts to wire into your existing stack:
/v1/chat/completions above your service objective — typically 2× the median over a rolling window. Fires when an inference worker, a backend model or a cloud provider has degraded.The Admin Console → Telemetry page suggests sensible defaults for each. Tune to your traffic shape; alerts that never fire are noise to your on-call.
Two backup paths cover the cluster state:
For offsite / disaster-recovery copies, mount your offsite storage on the data worker and point the snapshot system at it — the 5.0 path. an upcoming 5.0.x patch adds the offsite-destination automation.
The controller runs a rolling-update orchestrator that walks every node in turn: drain → install → restart → verify, then move on. From the Admin Console → Updates, pick the target version and start; the orchestrator handles the sequence and reports per-node status.
For single-node installs the standard sudo dnf update eldric-aios works directly. For air-gapped clusters, mirror repo.eldric.ai/5.0/ locally and point dnf at the mirror.
Rollback automation arrives in an upcoming 5.0.x patch. On 5.0, rollback is manual: pin the previous version and re-run the orchestrator.
The platform ships a stress-test harness — parallel-user × request-count load against a cluster host with pass/fail thresholds for p99 latency and error budget. Run it before the soak window when you're commissioning a cluster, and re-run after a meaningful capacity change. The results compare against the published demo-cluster baseline.
journalctl -u eldric-aios --since today --no-pager — first place to look for any daemon issue.For the developer-side view: for developers + API reference. For the deeper-on-the-platform view: how it works. Questions: office@eldric.ai.