4. Compile & Explain
This tutorial covers the two commands that turn a YAML spec into a real,
runnable model object: `explain` and `compile`. After validating the spec
in the previous tutorial, the goal here is to understand what model the spec
produces and how that model connects to the HuggingFace ecosystem.
Why Compilation Is Needed
EulerStack's YAML is only a declarative design document; it is not directly executable. For PyTorch to understand it, the spec must pass through an intermediate representation (IR) and then be compiled into one of two artifacts.
- **JSON runtime config**: the spec flattened into a plain dict and saved for inspection, debugging, and CI. The model itself is not built; think of it as a JSON-frozen snapshot of the spec.
- **HuggingFace model directory**: an actual `PreTrainedModel` constructed in memory and saved via `save_pretrained()`. The resulting directory contains `config.json` and `model.safetensors` (random weights) and can be loaded by eulerforge or any HF-compatible trainer.
Together these two artifacts are the bridge between "architecture design" and "model training". Freezing the design as JSON makes debugging easy; turning it into an HF directory makes training easy.
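As a rough sketch of the first artifact: the compiled JSON is just the spec flattened into plain key/value pairs. The excerpt below is illustrative only; apart from `model_type: eulerstack` and the dims used throughout this tutorial, the exact field names are assumptions, not EulerStack's real schema.

```json
{
  "model_type": "eulerstack",
  "d_model": 1024,
  "n_layers": 32,
  "vocab_size": 32000,
  "positional": "rope",
  "layer_schedule": ["mamba", "retnet", "mamba", "retnet", "mamba", "attn"]
}
```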
Inspecting a Model (explain)
Before generating weights, it is useful to see what the spec will draw. The
explain command reads the YAML and prints layer templates, schedule, and
parameter estimates as text. No GPU is required.
```shell
eulerstack explain --preset configs/presets/arch_expert_research.yml
```
Sample output:
```
Model: arch-expert-research
Family hint: full-hybrid-moe
Dims: d_model=1024, n_heads=8, n_kv_heads=4
Vocab: 32000, max seq: 32768, dtype: bfloat16
Positional: rope
Layer templates:
  attn:
    mixer: attention {...}
    ffn: gated_mlp
    ...
  mamba:
    mixer: mamba {variant: mamba2, d_state: 128, ...}
    ffn: gated_mlp
    ...
  retnet:
    mixer: retnet {...}
    ...
Layer schedule:
  mamba x1, retnet x1, mamba x1, retnet x1, mamba x1, attn x1, ...
Total layers: 32
Head: causal_lm, tie_weights=True
Compile target: huggingface
Estimated params: 1.48B (1,481,235,456)
Target params: 1.50B (ratio: 98.7%)
```
If the output does not match your expectations, edit the YAML. Running `explain`
before the expensive GPU compile step will catch many mistakes cheaply.
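As a quick sanity check, the printed parameter estimate can be reproduced approximately by hand from the dims in the sample output. The arithmetic below is a back-of-envelope illustration, not EulerStack's actual estimator:

```python
# Numbers taken from the sample `explain` output above.
d_model = 1024
vocab = 32000
n_layers = 32
total = 1_481_235_456   # "Estimated params"

# Embedding table: one d_model-sized vector per vocabulary entry.
# With tie_weights=True the LM head reuses this matrix, adding no parameters.
embedding = vocab * d_model
print(f"embedding: {embedding:,}")    # embedding: 32,768,000

# Remaining budget spread evenly across the 32 blocks.
per_layer = (total - embedding) // n_layers
print(f"~per layer: {per_layer:,}")   # ~per layer: 45,264,608
```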
Compiling to JSON (Inspection)
`compile --print-config` outputs the JSON version of the spec to the terminal;
`--output <file>` saves it to a file instead.
```shell
# Print JSON to terminal
eulerstack compile --preset configs/presets/arch_advanced_jamba.yml --print-config

# Save to file
eulerstack compile --preset configs/presets/arch_advanced_jamba.yml --output compiled.json
```
The JSON includes every field the HuggingFace config expects: `model_type`, the
stack pattern, block defaults, positional encoding, MoE routing, memory settings,
and so on. Running `diff` between two compiled JSONs is an effective way to
pinpoint exactly how two presets differ structurally.
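Because the compiled JSON is a plain nested dict, such comparisons are also easy to script. The sketch below is a minimal illustration; the `flat_diff` helper and the inline configs are hypothetical, not part of EulerStack:

```python
import json

def flat_diff(a, b, prefix=""):
    """Recursively compare two nested configs, yielding (dotted_key, old, new)."""
    for key in sorted(set(a) | set(b)):
        path = f"{prefix}{key}"
        va, vb = a.get(key), b.get(key)
        if isinstance(va, dict) and isinstance(vb, dict):
            yield from flat_diff(va, vb, path + ".")
        elif va != vb:
            yield path, va, vb

# In practice you would load two files produced by `compile --output`:
#   a = json.load(open("jamba.json")); b = json.load(open("research.json"))
a = {"d_model": 1024, "moe": {"num_experts": 8}}
b = {"d_model": 1024, "moe": {"num_experts": 16}}
for path, old, new in flat_diff(a, b):
    print(f"{path}: {old} -> {new}")   # moe.num_experts: 8 -> 16
```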
Compiling to HuggingFace Model (Export)
The primary artifact is the HF model directory, a folder that the
`transformers` library can load as a standard PyTorch model.
```shell
eulerstack compile --preset configs/presets/arch_advanced_jamba.yml --output-dir ./my_jamba_model
```
A successful export prints something like:
```
HF model saved: ./my_jamba_model
  model_type: eulerstack
  params: 1.20B (1,199,582,208)
  layers: 32
  load: AutoModelForCausalLM.from_pretrained('./my_jamba_model', trust_remote_code=True)
```
The directory layout is:
```
my_jamba_model/
├── config.json                  # EulerStackConfig (all architecture parameters)
├── model.safetensors            # Randomly-initialized weights
├── configuration_eulerstack.py  # Custom config loaded via trust_remote_code
└── modeling_eulerstack.py       # Model class loaded via trust_remote_code
```
The `configuration_eulerstack.py` and `modeling_eulerstack.py` files are saved
alongside so that `AutoModelForCausalLM.from_pretrained(trust_remote_code=True)`
can rebuild the model even on a machine that does not have eulerstack
installed; HF reads these files at load time.
Loading in Python
The exported directory loads through the standard HF interface.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from eulerstack.hf.auto_register import register_eulerstack_auto_classes

register_eulerstack_auto_classes()

# Same API as loading Llama / Mistral
model = AutoModelForCausalLM.from_pretrained(
    "./my_jamba_model",
    trust_remote_code=True,
    dtype="bfloat16",
)

# A standard PreTrainedModel
print(type(model))  # <class 'EulerStackForCausalLM'>
print(model.config.n_layers)

# Verify a forward pass runs
ids = torch.randint(0, 32000, (1, 128))
with torch.no_grad():
    out = model(ids)
print(out.logits.shape)  # (1, 128, 32000)
```
`register_eulerstack_auto_classes()` registers EulerStack's config and model
classes with HF's `AutoConfig` / `AutoModel` registry. This registration is
what allows `AutoModelForCausalLM` to resolve `model_type: eulerstack`
correctly.
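Conceptually, this registration is a two-step lookup: the `model_type` string in `config.json` selects a config class, and the config class selects a model class. The toy sketch below illustrates the idea only; it is a simplified stand-in, not transformers' actual implementation:

```python
# Toy registries standing in for HF's AutoConfig / AutoModel machinery.
CONFIG_REGISTRY = {}  # model_type string -> config class
MODEL_REGISTRY = {}   # config class -> model class

class EulerStackConfig:
    model_type = "eulerstack"

class EulerStackForCausalLM:
    def __init__(self, config):
        self.config = config

def register(model_type, config_cls, model_cls):
    CONFIG_REGISTRY[model_type] = config_cls
    MODEL_REGISTRY[config_cls] = model_cls

def from_pretrained(config_dict):
    # Resolution step: the `model_type` field picks both classes.
    config_cls = CONFIG_REGISTRY[config_dict["model_type"]]
    return MODEL_REGISTRY[config_cls](config_cls())

register("eulerstack", EulerStackConfig, EulerStackForCausalLM)
model = from_pretrained({"model_type": "eulerstack"})
print(type(model).__name__)   # EulerStackForCausalLM
```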
Full Pipeline Summary
All commands combined form this workflow:
```
YAML preset
  ↓ eulerstack validate --preset X --report    (catch errors early)
  ↓ eulerstack explain --preset X              (confirm structure)
  ↓ eulerstack compile --preset X --output-dir (export HF model)
HF model directory
  ↓ AutoModelForCausalLM.from_pretrained()     (load in Python)
  ↓ eulerforge train --preset ...              (train on real data)
Trained model
```
This is EulerStack's core workflow. Edit YAML, run `compile --output-dir`, and
the new architecture is in the HF ecosystem. From that point on, the standard
HF tooling (PEFT, TRL, eulerforge) applies without modification.
Validate-Only Mode
If you only want to confirm the spec compiles without actually building the HF
model, use `--validate-only`. This is fast and does not need a GPU.

```shell
eulerstack compile --preset my_model.yml --validate-only
```
Useful for CI pipelines that need to catch YAML preset regressions quickly.
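One hypothetical way to wire this into CI, sketched as a GitHub Actions-style job. The `pip install eulerstack` step and the preset paths are assumptions about your repository layout:

```yaml
# Hypothetical CI job: fail the build if any preset stops compiling.
validate-presets:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - run: pip install eulerstack   # assumed install method
    - run: |
        for preset in configs/presets/*.yml; do
          eulerstack compile --preset "$preset" --validate-only
        done
```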
Language Support
All of `compile`, `explain`, and `validate` honor the `--lang` flag.
```shell
eulerstack --lang en compile --preset my_model.yml --output-dir ./my_model
eulerstack --lang zh compile --preset my_model.yml --output-dir ./my_model
```
Success / error messages are translated. Command names and option names are not translated — this keeps scripting portable across locales.
Runnable Examples
To run the flow in this tutorial as a script rather than typing commands by
hand, look under `examples/`:

- `examples/01_compile_and_export.py`: end-to-end compile / `save_pretrained` flow
- `examples/02_load_and_generate.py`: load and run text generation
- `examples/03_architecture_evolution.py`: compare multiple presets at once
Next Steps
- Tutorial 5: Prepare Data — tokenized training data
- Tutorial 6: Sanity Train — short runs to confirm the model learns