Preview in 5.0.x

One daemon, compiled graph.

The structured-ML workloads — policy execution, forecasting, vision-language encoding, associative retrieval — get a native compile path on the 5.0.x train. No Python sidecar, no separate model server. The same single daemon serves them via a compiled graph. Faster cold-start, less memory pressure, fewer moving parts to monitor.

What changes

Native, not orchestrated.

5.0 already ships the structured-ML workloads, but today's implementation routes through a Python runtime for portability. That works, but the overhead is real: slower cold-start, a larger memory footprint, and an extra process to manage and watch.

An upcoming 5.0.x patch ships a native execution path. The same model graph compiles directly into the platform's inference daemon, with no Python interpreter on the hot path, no separate model server, and no inter-process round-trip. The API surface, model files, and workload semantics stay the same; only the execution behind them changes.
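The "same workload semantics" claim can be pictured as two interchangeable backends whose outputs are compared byte for byte. The sketch below is illustrative only: `Backend`, `ReferenceBackend`, `NativeBackend`, and `first_mismatch` are hypothetical names, not the platform's API, and both backends are stubbed with the same toy computation in place of a real model graph.

```python
import struct
from typing import Optional, Protocol, Sequence


class Backend(Protocol):
    """Hypothetical interface: one call in, raw output bytes back."""

    def run(self, inputs: Sequence[float]) -> bytes: ...


def _toy_graph(inputs: Sequence[float]) -> bytes:
    # Stand-in for a model graph: a running sum, packed as little-endian float32.
    total = 0.0
    outputs = []
    for x in inputs:
        total += x
        outputs.append(total)
    return struct.pack(f"<{len(outputs)}f", *outputs)


class ReferenceBackend:
    """Stub for the Python-runtime path, used as the reference."""

    def run(self, inputs: Sequence[float]) -> bytes:
        return _toy_graph(inputs)


class NativeBackend:
    """Stub for the compiled path; a real one would call into the daemon."""

    def run(self, inputs: Sequence[float]) -> bytes:
        return _toy_graph(inputs)


def first_mismatch(reference: bytes, candidate: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    if reference == candidate:
        return None
    for offset, (a, b) in enumerate(zip(reference, candidate)):
        if a != b:
            return offset
    # No differing byte in the shared range: one buffer is a prefix of the other.
    return min(len(reference), len(candidate))


inputs = [0.5, 1.25, -3.0]
offset = first_mismatch(ReferenceBackend().run(inputs), NativeBackend().run(inputs))
print("byte-exact" if offset is None else f"first difference at byte {offset}")
```

Comparing raw bytes rather than values within a tolerance is the stricter check: it catches differences in rounding, element order, and output length that an approximate comparison could hide.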


Where it shows

Visible on Apple Silicon. Useful on Linux.

On Apple Silicon, the native path runs on the unified-memory architecture without the host-to-GPU transfer cost that the Python runtime imposes. The platform's macOS GUI, and standalone deployments on a developer's laptop, get a noticeably faster cold-start and a smaller resident set.

On Linux servers, the win is operational rather than visible: one less process to monitor, one less surface for sysadmins to debug, and a lower memory footprint for the same throughput. Customers running the structured-ML workloads at scale get more headroom on the same hardware.
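Cold-start and resident-set claims like these can be checked on your own hardware. The following is a rough sketch, assuming a Unix-like host; `measure_launch` is a hypothetical helper, and timing a command that starts and exits is only a proxy for a daemon's real cold-start.

```python
import resource
import subprocess
import sys
import time
from typing import Sequence, Tuple


def measure_launch(cmd: Sequence[str]) -> Tuple[float, int]:
    """Run cmd to completion; return (wall seconds, peak child RSS in bytes)."""
    start = time.perf_counter()
    subprocess.run(list(cmd), check=True)
    elapsed = time.perf_counter() - start

    # ru_maxrss for RUSAGE_CHILDREN is the largest peak among all children
    # waited for so far, so measure one command per interpreter run.
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    if sys.platform != "darwin":
        peak *= 1024  # Linux reports kilobytes; macOS reports bytes.
    return elapsed, peak


# Placeholder command: substitute the process you actually want to measure.
elapsed, peak = measure_launch([sys.executable, "-c", "pass"])
print(f"launch took {elapsed:.3f} s, peak RSS {peak / 2**20:.1f} MiB")
```

Running the same measurement against both execution paths, on the same machine and model, gives a like-for-like comparison.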


What's pending

Honest gates on this page.

Still in flight

  • Compile path for policy execution (output already validated byte-for-byte against the Python reference)
  • Compile path for forecasting
  • Compile path for vision-language encoding
  • Compile path for associative retrieval (already native today)
  • Apple Silicon: unified-memory optimisations
  • Linux: CUDA build alongside the CPU build, at parity with the existing structured-ML daemon
  • Migration path for existing 5.0 customers (config flag flip, no model conversion)
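The "config flag flip" in the migration item might look something like the sketch below. The key name `structured_ml.execution_path`, its values, and the JSON format are assumptions made for illustration; the real flag will be documented in the release notes.

```python
import json

# Hypothetical values for a hypothetical flag; not the platform's real config.
LEGACY_PATH = "python"
NATIVE_PATH = "native"


def select_execution_path(config_text: str) -> str:
    """Read the (assumed) flag, defaulting to the legacy path when absent."""
    config = json.loads(config_text)
    path = config.get("structured_ml", {}).get("execution_path", LEGACY_PATH)
    if path not in (LEGACY_PATH, NATIVE_PATH):
        raise ValueError(f"unknown execution_path: {path!r}")
    return path


before = '{"structured_ml": {"model": "forecast-v1"}}'
after = '{"structured_ml": {"model": "forecast-v1", "execution_path": "native"}}'

print(select_execution_path(before))
print(select_execution_path(after))
```

Defaulting to the legacy path when the flag is absent keeps existing configurations working unchanged, and the model entry is identical before and after, which matches the "no model conversion" promise.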

This page updates as each piece lands. The release notes remain the authoritative record of what has shipped.


Read next.

For the structured-ML stack today, see xLSTM for IoT. For the inference daemon overall, see smart memory inference. For the full 5.0.x roadmap, see what's next in 5.0.x.