Automotive & robotics · use case

The shop-floor AI that doesn’t phone home

by Juergen Paulhart · 2026-04-24 · ~8 min read

“Line 3 is running hot. The technician opens five tools, reads three SOPs, calls the retired specialist, and checks the historian. We have one process, six interfaces, zero memory.”
[Diagram: ELDRIC on the plant floor — inside the plant perimeter. Sources: PLCs/SCADA over OPC-UA, legacy cells over Modbus TCP/RTU, field/IIoT over MQTT Sparkplug B, a 10-year time-series historian, SOPs/manuals on NFS (PDFs), and an ECU archive (.a2l · .mdf · .dbc). The eldric-aios iot module (iiotd) ingests OPC-UA · Modbus · MQTT with store-and-forward buffering; the data module covers historian + NFS + vector store. Matrix Memory (robotics 128/512) does cross-platform failure-pattern recall via matrix-vector lookups in milliseconds. Inference: llama.cpp on-prem, 3× RTX 4090 · 70B Q4, no cloud · plant network only, with an IATF 16949 audit log of traceable decisions per work-order. Technician chat: "why is line 3 running hot?" — answer in seconds + cited tag values + SOP section + prior incident from 2021-Q3.]

Automotive OEMs, tier-one suppliers, and robotics integrators share a compact AI wish-list: an assistant on the plant floor that can read PLCs and SCADA, a test-bench corpus an engineer can query, a pattern memory that survives model-year boundaries, and a guarantee that IP never leaves the facility perimeter.

Eldric AI OS ships the relevant pieces today. See also the industrial AI ops-assistant article for the broader positioning.

Value propositions

Plant-floor protocols native

The iot module speaks OPC-UA, Modbus TCP/RTU, and MQTT Sparkplug B directly — no SCADA-to-REST middleware. Live tag values enter the fan-out as citeable sources.
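For a feel of what "native" means, here is a direct tag read in Python with the open-source asyncua library — illustrative only: iiotd does this inside the single binary, and the endpoint URL and node ID below are made up.

```python
import asyncio
from asyncua import Client  # pip install asyncua

# Hypothetical endpoint and node ID -- substitute your PLC's values.
ENDPOINT = "opc.tcp://plc-line3.plant.local:4840"
NODE_ID = "ns=2;s=Line3.Extruder.BarrelTemp"

async def read_tag() -> None:
    async with Client(url=ENDPOINT) as client:
        node = client.get_node(NODE_ID)
        value = await node.read_value()
        # A value read this way can be attached to an answer as a cited source.
        print(f"{NODE_ID} = {value}")

asyncio.run(read_tag())
```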

Store-and-forward reliability

WAN flap? Buffer drains on reconnect. Cell-level islanding doesn’t lose telemetry. The AI still answers questions about last shift after the link was down.
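The pattern is easy to sketch: write locally first, replay on reconnect. A minimal disk-backed buffer in Python, with SQLite as the buffer (function names, schema, and paths are hypothetical, not iiotd's internals):

```python
import json, sqlite3, time

# Disk-backed buffer: readings survive process restarts and WAN flaps.
db = sqlite3.connect("telemetry-buffer.db")
db.execute("CREATE TABLE IF NOT EXISTS buf (ts REAL, payload TEXT)")

def record(tag: str, value: float) -> None:
    """Always write locally first; the WAN is never in the hot path."""
    db.execute("INSERT INTO buf VALUES (?, ?)",
               (time.time(), json.dumps({"tag": tag, "value": value})))
    db.commit()

def drain(publish) -> None:
    """On reconnect, replay buffered rows in order, then clear them."""
    for ts, payload in db.execute("SELECT ts, payload FROM buf ORDER BY ts"):
        publish(payload)              # e.g. an MQTT publish callback
    db.execute("DELETE FROM buf")
    db.commit()
```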

ECU archive search

Decade-long .a2l / .mdf / .dbc archives live on NFS. The data module indexes them; the chat answers “which calibration hit 98% on the cold-start test?” with a pointer.
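The indexing pass is conceptually a walk over the NFS mount followed by chunking and embedding into the vector store; a toy sketch (the mount point and extension set are assumptions):

```python
from pathlib import Path

# Hypothetical NFS mount point for the decade-long ECU archive.
ARCHIVE = Path("/mnt/ecu-archive")
EXTENSIONS = {".a2l", ".mdf", ".dbc"}

def scan_archive() -> list[Path]:
    """Collect every calibration/measurement artefact worth indexing."""
    return [p for p in ARCHIVE.rglob("*")
            if p.is_file() and p.suffix.lower() in EXTENSIONS]

for path in scan_archive():
    # The real data module chunks, embeds, and stores each file so that
    # "which calibration hit 98%?" resolves to a pointer into the archive.
    print(path)
```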

Cross-platform Matrix Memory

Per-program tenant, per-platform workgroup. Matrix Memory compresses failure patterns across model years. The ‘did we see this before’ question gets answered in milliseconds.
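Why milliseconds? Because recall is a single matrix-vector product. A conceptual NumPy sketch at the robotics domain's 128/512 (rank/dim) shape — treating the rank as a set of stored basis patterns is a simplification, and the .emm internals aren't public:

```python
import numpy as np

RANK, DIM = 128, 512   # robotics domain: 128 basis patterns x 512-dim embeddings

# Hypothetical memory matrix: one row per compressed failure pattern.
memory = np.random.randn(RANK, DIM).astype(np.float32)
query = np.random.randn(DIM).astype(np.float32)   # embedding of the new fault

# One matrix-vector product scores all patterns at once -- sub-millisecond on CPU.
scores = memory @ query
top3 = np.argsort(scores)[-3:][::-1]
print("closest prior patterns:", top3)
```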

IATF 16949-shaped audit

Hash-chained audit log per work-order. Traceability requirements met by construction, not by bolt-on tooling.
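The construction is simple enough to show whole: each record embeds the SHA-256 of its predecessor, so editing any record breaks every later hash. A minimal JSONL sketch (field names are illustrative, not Eldric's schema):

```python
import hashlib, json

def append_record(log_path: str, record: dict, prev_hash: str) -> str:
    """Append one audit record whose hash covers the previous record's hash."""
    record = {**record, "prev": prev_hash}
    line = json.dumps(record, sort_keys=True)
    digest = hashlib.sha256(line.encode()).hexdigest()
    with open(log_path, "a") as f:
        f.write(json.dumps({"rec": record, "hash": digest}) + "\n")
    return digest  # feed into the next append

h = "0" * 64  # genesis
h = append_record("audit.jsonl", {"work_order": "WO-4711", "action": "chat"}, h)
h = append_record("audit.jsonl", {"work_order": "WO-4711", "action": "cite"}, h)
```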

IP stays inside the plant

On-prem llama.cpp. No supplier chat content lands in a hyperscaler’s logs. Tier-one and OEM confidentiality boundaries are tenant boundaries.

AI-driven differentiator

The industrial-AI market has been dominated by SaaS ingestion platforms that assume your PLCs can talk to their cloud. They can’t, they shouldn’t, and your safety engineer will say no. Eldric inverts the assumption: the AI runs where the PLC already is, speaks the protocol the PLC already speaks, and stays inside the plant network. That’s the shape that actually passes shop-floor IT review.

Scalable use cases

Runs on commodity hardware

Eldric AI OS was built to land on small clusters, not on hyperscaler fleets. The whole stack is one binary; the on-prem LLM is embedded llama.cpp. The hardware plan that gets most organisations into production looks like this:

3× RTX 4090 — sweet spot

72 GB total VRAM with tensor-split: Llama 3.3 70B Q4 at 60–80 tok/s, with a parallel 8B routing model and an embedding server running concurrently. One-time hardware cost ~€5–7k.
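The back-of-envelope budget behind that claim. The 70B weight size matches the disk-bill table below; the Llama-70B shape (80 layers, 8 KV heads, 128-dim heads) is the published architecture; the runtime overhead figure is our assumption:

```python
# Back-of-envelope VRAM budget for 3x RTX 4090 (72 GB) -- approximate figures.
weights_70b_q4 = 40.0          # GB, Q4_K_M GGUF (see disk-bill table below)
# KV cache: 2 (K+V) * 80 layers * 8 KV heads * 128 head_dim * 2 bytes (fp16)
kv_per_token = 2 * 80 * 8 * 128 * 2 / 1e9          # ~0.00033 GB per token
kv_16k = kv_per_token * 16384                       # ~5.4 GB for a 16k context
routing_8b = 4.9               # GB, parallel 8B Q4 routing model
embedder = 0.7                 # GB, nomic-embed-text
overhead = 4.0                 # GB, CUDA contexts + scratch buffers (assumed)

total = weights_70b_q4 + kv_16k + routing_8b + embedder + overhead
print(f"~{total:.0f} GB of 72 GB")   # ~55 GB: comfortably under budget
```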

Single RTX 4090 / 4080 — team scale

24 GB. Llama 3.1 8B at 80+ tok/s, 13B comfortable, 32B Q4 possible. Enough for a small department chat with fan-out retrieval.

CPU-only — pilot scale

llama.cpp on 32+ core x86 runs 8B Q4 usefully. Matrix Memory is CPU-memory-bound. A refurbished server from the rack is enough to prove the architecture.

Scale up

Multi-node cluster with H100 / GH200 for research-grade workloads. Same binary, same role modules, topology-aware. See the HPC article.

Plant edge baseline

One 3×4090 node at the plant edge runs inference + iiotd for a whole assembly line. No WAN dependency means no downtime when the MPLS link to HQ blinks.

The arithmetic: a €6k workstation displaces a €30–60k-per-year SaaS-AI contract that still leaks IP, still can’t reach your mainframe, and still has a “we may use your data for training” clause hiding somewhere.
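Spelled out, with the figures from above (the capex is the 3×4090 build):

```python
# Payback period for the on-prem build vs. the SaaS contract it displaces.
capex = 6_000                       # EUR, one-time 3x4090 workstation
for saas_per_year in (30_000, 60_000):
    months = capex / (saas_per_year / 12)
    print(f"vs EUR {saas_per_year}/yr SaaS: payback in {months:.1f} months")
# -> 2.4 months at the low end, 1.2 months at the high end.
```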

What the disk bill looks like

Artefact                                    | Size               | Notes
--------------------------------------------|--------------------|------------------------------------------------
eldric-aios-5.0.0-3.alpha3.fc43.x86_64.rpm  | ~1.4 MB            | CPU baseline binary; one RPM, one systemd unit.
eldric-aios-cuda add-on                     | ~512 MB            | Pulled in automatically via Supplements: cuda-drivers on GPU hosts. Contains GGML_CUDA llama.cpp.
Llama 3.1 8B Q4_K_M GGUF                    | ~4.9 GB            | Good default for team-scale chat on a single 4090.
Llama 3.3 70B Q4_K_M GGUF                   | ~40 GB             | The sweet spot for 3×4090 tensor-split. Holds a 16k context comfortably.
Mixtral 8x22B Q4 GGUF                       | ~80 GB             | Tight on 3×4090; comfortable on 4×4090 or 2×H100.
nomic-embed-text (embedding)                | ~700 MB            | CPU or GPU. One per cluster; handles vector indexing.
Matrix Memory .emm per domain               | 50–500 MB          | Depends on rank × dim (see memory article). chat 64/768 ~200 kB; particle_physics 512/1024 ~500 MB.
Vector store per 1M chunks                  | ~6–10 GB           | Depends on embedding dim. SQLite backend; FAISS optional.
Hash-chained audit log                      | ~200 MB / 1M calls | JSONL, append-only, rotation at 500 MB files by default.

Three reference hardware setups

              | Pilot / team                           | Department / BU                       | Production / enterprise
--------------|----------------------------------------|---------------------------------------|------------------------------------------
CPU           | 1× EPYC 7313 (16c) or i9-14900K        | 2× EPYC 9354 (32c each)               | 2× EPYC 9654 (96c) per node
GPU           | 1× RTX 4090 (24 GB)                    | 3× RTX 4090 (72 GB)                   | 4× H100 (320 GB) or 8× H200
RAM           | 128 GB DDR5                            | 256 GB DDR5 ECC                       | 1 TB DDR5 ECC per node
Storage       | 2× 4 TB NVMe (RAID-1)                  | 6× 8 TB NVMe (RAID-10) + SSD cache    | Tiered: NVMe hot + TB-scale HDD / Lustre
Network       | 1 GbE OK                               | 10 GbE with link agg                  | 25/100 GbE or IB-HDR for multi-node
Power         | ~1 kW typical / 1.5 kW peak            | ~2 kW typical / 3 kW peak             | 4–6 kW per node
Hardware cost | ~€4–5k                                 | ~€12–15k                              | €80–250k per node
Serves        | 8B model, 10–30 concurrent chat users  | 70B Q4 at 60–80 tok/s, 200–500 users  | Mixtral / Llama-405B, 2k+ users per node


SWOT — an honest read

Strengths

  • iiotd module speaks OPC-UA + Modbus + MQTT Sparkplug B natively
  • Matrix Memory robotics domain sized for motor-control patterns (128/512)
  • Hash-chained audit log supports IATF 16949 traceability
  • Single-RPM install surface — plant IT review-friendly

Weaknesses

  • MES / ERP vendor-specific connectors (SAP ME, Opcenter, Oracle MES) require custom extensions
  • Vision-model integration for visual inspection is customer’s choice of model
  • No native integration with simulation environments (IPG CarMaker) yet
  • Certified industrial-safety posture not yet audited

Opportunities

  • EU industrial sovereignty push (Chips Act, Net-Zero Industry Act)
  • IATF 16949 revisions increasing traceability requirements
  • Cobot + collaborative-robotics scaling in EU manufacturing
  • Plant-floor WAN failures making cloud-AI intolerable

Threats

  • Siemens AI embedded in TIA Portal
  • Rockwell FactoryTalk AI
  • PTC ThingWorx AI
  • GE Digital Proficy AI

First entry points — concrete value in 30 / 90 / 180 days

30 days

One-cell pilot

Install on a workstation inside the plant network. Connect to one OPC-UA server + one NFS SOP share. Pilot with one shift of one line.

90 days

Program-wide deployment

Tenant = powertrain program. ECU archive indexed. Matrix Memory seeded with prior-year test-bench data. IATF 16949 audit log reviewed by QA.

180 days

Multi-plant rollout

Per-plant tenant, shared group memory. Supplier Chinese walls enforced. Cross-program failure-pattern queries run quarterly.

Install alpha.3 · Industrial AI article · Memory article · Privacy-first · office@eldric.ai
#Automotive #Robotics #OPC-UA #Modbus #SparkplugB #IATF16949 #OnPrem #MatrixMemory