AI model fine-tuning, LoRA adapters, RLHF alignment, training chains, and latent reasoning techniques. Train and customize models with production-grade infrastructure you fully control.
Runs on port 8898. Choose from Unsloth, Axolotl, TRL, DeepSpeed, MLX, or llama.cpp; each backend is optimized for different hardware and training scenarios.
LoRA, QLoRA, SFT, DPO, RLHF, PPO, and full fine-tuning. From lightweight adapters to complete model retraining.
Visual node-based pipeline configuration. Chain data sources, AI generators, validators, and trainers into automated workflows.
COCONUT, Quiet-STaR, Pause Tokens, Hidden CoT, and DeepSeek DSA. Train models to reason in latent space.
Six production-ready backends covering CUDA, ROCm, Apple Silicon, and CPU training. The Training Worker auto-detects available backends from the Python virtual environment.
| Backend | Description | Hardware | Methods | Key Advantage |
|---|---|---|---|---|
| Unsloth | Fast LoRA/QLoRA training | CUDA | LoRA QLoRA SFT | 2x training speedup |
| Axolotl | Flexible YAML-based training | CUDA, ROCm | LoRA QLoRA SFT Full | YAML config, broad model support |
| TRL | HuggingFace RLHF library | CUDA | SFT DPO RLHF PPO | Full alignment pipeline |
| DeepSpeed | Distributed training | CUDA, multi-GPU | SFT Full LoRA | Multi-node, ZeRO optimization |
| MLX | Apple Silicon training | Apple MPS | LoRA SFT | Native macOS, unified memory |
| llama.cpp | GGUF-based training | CUDA, CPU, MPS | LoRA SFT | GGUF format, low resource |
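The table above can be read as a capability matrix. As an illustration (the mapping below restates the table; the names and selection logic are not the Training Worker's actual implementation), picking candidate backends for a given hardware platform and training method might look like:

```python
# Capability matrix restated from the backend table above.
# Illustrative only; the Training Worker's own detection logic may differ.
BACKENDS = {
    "unsloth":   {"hardware": {"cuda"},               "methods": {"lora", "qlora", "sft"}},
    "axolotl":   {"hardware": {"cuda", "rocm"},       "methods": {"lora", "qlora", "sft", "full"}},
    "trl":       {"hardware": {"cuda"},               "methods": {"sft", "dpo", "rlhf", "ppo"}},
    "deepspeed": {"hardware": {"cuda"},               "methods": {"sft", "full", "lora"}},
    "mlx":       {"hardware": {"mps"},                "methods": {"lora", "sft"}},
    "llama.cpp": {"hardware": {"cuda", "cpu", "mps"}, "methods": {"lora", "sft"}},
}

def candidates(hardware: str, method: str) -> list[str]:
    """Return backends that run on the given hardware and support the method."""
    return [name for name, spec in BACKENDS.items()
            if hardware in spec["hardware"] and method in spec["methods"]]

print(candidates("cuda", "lora"))  # ['unsloth', 'axolotl', 'deepspeed', 'llama.cpp']
print(candidates("mps", "sft"))    # ['mlx', 'llama.cpp']
```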
Low-Rank Adaptation. Inserts small trainable matrices into frozen model layers. Efficient, fast, and produces lightweight adapters.
Quantized LoRA. Combines 4-bit quantization with LoRA for memory-efficient training on consumer GPUs.
Supervised Fine-Tuning. Standard instruction tuning on labeled data to teach models specific behaviors and formats.
Direct Preference Optimization. Aligns models using human preference data without a separate reward model.
Reinforcement Learning from Human Feedback. Full alignment pipeline with reward modeling and policy optimization.
Proximal Policy Optimization. The RL algorithm used in RLHF for stable policy updates during alignment training.
Update all model parameters. Maximum capability but requires substantial compute and memory resources.
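The parameter arithmetic shows why LoRA adapters are lightweight compared with full fine-tuning: a full update to a d×d weight matrix trains d² parameters, while a rank-r LoRA adapter factors the update into two small matrices (d×r and r×d), training only 2dr parameters. A quick worked example (the hidden size and rank below are typical values, not specific to any backend here):

```python
def lora_params(d: int, r: int) -> int:
    # Rank-r LoRA replaces a trainable d x d weight update with the
    # product B @ A, where A is r x d and B is d x r: 2 * d * r params.
    return 2 * d * r

d, r = 4096, 16            # typical hidden size, common adapter rank
full = d * d               # 16,777,216 trainable params for a full update
lora = lora_params(d, r)   # 131,072 trainable params for the adapter
print(f"reduction: {full // lora}x")  # -> reduction: 128x
```

QLoRA keeps the same adapter math but stores the frozen base weights in 4-bit precision, which is where the additional memory savings come from.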
Latent reasoning techniques train models to reason in continuous latent space rather than producing verbose text-based reasoning chains. This enables faster, more efficient inference while maintaining or improving reasoning quality.
Chain of Continuous Thought. Replaces discrete token reasoning with continuous latent representations. The model learns to reason entirely in embedding space.
Self-Taught Reasoner. Trains models to generate internal thought tokens before each response token, creating implicit chain-of-thought reasoning.
Inserts learnable `<pause>` tokens before the model generates its answer, giving the transformer extra computation steps for complex reasoning.
Distills explicit chain-of-thought reasoning into hidden representations. Models learn to compress verbose reasoning into dense internal states.
Dynamic Sparse Attention. Enables efficient reasoning by dynamically selecting which attention heads participate in each computation step.
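The pause-token idea is the easiest of these to picture at the data level: training sequences are rewritten so that a run of pause tokens sits between the prompt and the answer, and the model consumes them as extra forward passes before committing to output. A toy sketch of that sequence rewriting (token strings and function name are illustrative, not part of any backend API):

```python
def insert_pause_tokens(prompt_tokens, answer_tokens, n_pauses=4,
                        pause_token="<pause>"):
    """Build a training sequence with pause tokens between prompt and
    answer, giving the model extra computation steps before it must
    produce the answer (token strings here are illustrative)."""
    return prompt_tokens + [pause_token] * n_pauses + answer_tokens

seq = insert_pause_tokens(["What", "is", "17*3", "?"], ["51"], n_pauses=2)
print(seq)  # ['What', 'is', '17*3', '?', '<pause>', '<pause>', '51']
```

In actual training the pause token gets its own learnable embedding, and the loss is computed only on the answer tokens.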
Training chains allow you to compose multi-step pipelines that connect data sources, AI-powered data generators, quality validators, and training backends into automated workflows. Create chains via the API or the web dashboard.
Load training data from local files, Data Worker storage, or remote URLs. Supports JSONL, Alpaca, ShareGPT, and OpenAI formats.
Use an LLM to generate synthetic training data from source documents. Configurable count, model, and generation template.
Quality filters and deduplication. Remove low-quality samples, check format compliance, and validate against schemas.
Execute training with any supported backend and method. Full hyperparameter configuration and checkpoint management.
Run benchmark evaluations on trained models. Compare against baselines and generate quality reports.
Automatically deploy trained models or adapters to inference workers in the Eldric cluster.
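A chain is a graph of typed nodes wired together by edges. As an illustration of what a chain definition might look like when posted to the API (the field names below are assumptions for this sketch, not the Training Worker's actual schema):

```python
import json

# Hypothetical chain definition: field names are illustrative,
# not the Training Worker's documented schema.
chain = {
    "name": "qa-finetune",
    "nodes": [
        {"id": "src",   "type": "data_source",  "config": {"path": "docs.jsonl", "format": "jsonl"}},
        {"id": "gen",   "type": "ai_generator", "config": {"count": 500, "template": "qa"}},
        {"id": "check", "type": "validator",    "config": {"dedupe": True}},
        {"id": "train", "type": "trainer",      "config": {"backend": "unsloth", "method": "qlora"}},
    ],
    "edges": [["src", "gen"], ["gen", "check"], ["check", "train"]],
}

payload = json.dumps(chain, indent=2)
print(payload)
```

Rather than building a chain by hand, you can also start from one of the pre-built templates listed at `/api/v1/chains/templates` (see below).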
Pre-built chain templates for common training workflows, available via /api/v1/chains/templates.
| Template | Description | Nodes |
|---|---|---|
| qa-pipeline | Generate Q&A pairs from documents and train a model | data_source → ai_generator → trainer |
| alignment-pipeline | SFT followed by DPO alignment | data_source → trainer(SFT) → trainer(DPO) |
| rag-tuning | Fine-tune on knowledge base with RAG validation | data_source → ai_generator → validator → trainer |
| code-tuning | Train coding capabilities from repository data | data_source → ai_generator → validator → trainer → evaluator |
- Health check and worker status
- Web dashboard overview
- Training jobs dashboard page
- Training chains dashboard page
- Backend status dashboard page
- List all training jobs
- Create a new training job
- Get training job details and progress
- Cancel a running training job
- Pause a running training job
- Resume a paused training job
- Get training job log output
- Get training metrics (loss, learning rate, etc.)
- List all training chains
- Create a new training chain
- Get chain configuration and status
- Delete a training chain
- Execute a training chain
- Get available chain templates
- List available training backends and their status
- Get GPU information, memory, and utilization
The Training Worker uses a Python virtual environment for all training backends. The venv is auto-detected at ~/.config/eldric/training-venv. Backend availability depends on which packages are installed.
| Backend | Required Packages | Hardware |
|---|---|---|
| TRL | trl, transformers, peft, accelerate | CUDA |
| MLX | mlx, mlx-lm | macOS (Apple Silicon only) |
| Axolotl | axolotl | CUDA, ROCm |
| Unsloth | unsloth | CUDA (required) |
| DeepSpeed | deepspeed | CUDA (required) |
| llama.cpp | System binary (llama-finetune) | CUDA, CPU, MPS |
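Auto-detection of this kind typically just probes the venv for each backend's packages (and the `llama-finetune` binary on PATH). A minimal sketch of that check, using the package names from the table above (the detection logic is an assumption, not the worker's actual implementation; note that the `mlx-lm` pip package imports as `mlx_lm`):

```python
import importlib.util
import shutil

# Importable module names per backend, from the table above.
REQUIRED = {
    "trl":       ["trl", "transformers", "peft", "accelerate"],
    "mlx":       ["mlx", "mlx_lm"],   # pip package mlx-lm imports as mlx_lm
    "axolotl":   ["axolotl"],
    "unsloth":   ["unsloth"],
    "deepspeed": ["deepspeed"],
}

def available_backends() -> dict[str, bool]:
    """A backend counts as available when all its modules resolve;
    llama.cpp is special-cased as a system binary lookup."""
    found = {backend: all(importlib.util.find_spec(mod) is not None
                          for mod in mods)
             for backend, mods in REQUIRED.items()}
    found["llama.cpp"] = shutil.which("llama-finetune") is not None
    return found

print(available_backends())
```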
The Training Worker integrates with Eldric Data Workers for centralized dataset storage, model artifact management, and cluster-wide data access.
Store and retrieve training datasets from Data Workers. Supports multi-tenant isolation, versioning, and quota management.
Save trained models, LoRA adapters, and checkpoints to Data Workers for cluster-wide distribution and deployment.
Real-time GPU monitoring for training job resource management. Supports NVIDIA CUDA GPUs and Apple Silicon unified memory.
Real-time VRAM usage monitoring per GPU and per training job. Alerts on memory pressure.
GPU compute utilization, temperature, power draw, and clock speeds via `nvidia-smi` or macOS `system_profiler`.
Distribute training across multiple GPUs with DeepSpeed ZeRO stages or data parallelism.
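For NVIDIA GPUs, this kind of telemetry is usually scraped from `nvidia-smi`'s machine-readable CSV mode (e.g. `nvidia-smi --query-gpu=utilization.gpu,temperature.gpu,power.draw,memory.used --format=csv,noheader,nounits`). A sketch of parsing one such line; the sample values are fabricated for illustration, and this is not the worker's actual monitoring code:

```python
import csv
import io

# One fabricated output line from:
#   nvidia-smi --query-gpu=utilization.gpu,temperature.gpu,power.draw,memory.used \
#              --format=csv,noheader,nounits
sample = "87, 71, 284.53, 22145"

fields = ["util_pct", "temp_c", "power_w", "vram_used_mib"]
row = next(csv.reader(io.StringIO(sample), skipinitialspace=True))
gpu = dict(zip(fields, (float(v) for v in row)))
print(gpu)  # {'util_pct': 87.0, 'temp_c': 71.0, 'power_w': 284.53, 'vram_used_mib': 22145.0}
```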
The Training Worker includes a built-in web dashboard at http://localhost:8898/dashboard for monitoring and managing training jobs, chains, and backends.
- `/dashboard/jobs`: View all training jobs with real-time progress, loss curves, and resource utilization. Start, pause, resume, and cancel jobs from the browser.
- `/dashboard/chains`: Visual pipeline editor for creating and managing training chains. View execution history and node-level status for each chain run.
- `/dashboard/backends`: Monitor installed training backends, GPU status, Python environment health, and backend-specific configuration options.
Training Worker capabilities are gated by license tier. The free tier provides MLX-based training for Apple Silicon users.
| Feature | Free | Standard | Professional | Enterprise |
|---|---|---|---|---|
| Training backends | MLX only | Unsloth, TRL | All | All |
| Max epochs | 3 | 10 | Unlimited | Unlimited |
| Max dataset size | 1K samples | 10K samples | 100K samples | Unlimited |
| Training chains | ✗ | ✓ | ✓ | ✓ |
| Latent reasoning | ✗ | ✗ | ✓ | ✓ |
| Multi-GPU | ✗ | ✗ | ✓ | ✓ |
| Distributed training | ✗ | ✗ | ✗ | ✓ |
| Data Worker integration | ✗ | ✓ | ✓ | ✓ |
| Training workers | 1 | 2 | 5 | Unlimited |
Contact license@core.at for custom licensing with unlimited training workers, backends, and dataset sizes. Enterprise licenses include priority support and distributed multi-node training.
Download the Eldric distributed package and start fine-tuning models on your own infrastructure.