4. Compile & Explain
This tutorial covers the two commands that turn a YAML spec into a real,
runnable model object: `explain` and `compile`. After validating the spec
in the previous tutorial, the goal here is to understand what model the spec
produces and how that model connects to the HuggingFace ecosystem.
Why Compilation Is Needed
EulerStack's YAML is only a declarative design document; it is not directly executable. For PyTorch to understand it, the spec must pass through an intermediate representation (IR) and then be compiled into one of two artifacts.
- **JSON runtime config**: the spec flattened into a plain dict and saved for inspection, debugging, and CI. The model itself is not built; think of it as a JSON-frozen snapshot of the spec.
- **HuggingFace model directory**: an actual `PreTrainedModel` constructed in memory and saved via `save_pretrained()`. The resulting directory contains `config.json` and `model.safetensors` (random weights) and can be loaded by eulerforge or any HF-compatible trainer.
Together these two artifacts are the bridge between "architecture design" and "model training". Freezing the design as JSON makes debugging easy; turning it into an HF directory makes training easy.
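As a rough sketch of the first artifact: the compiled JSON is just the spec flattened into plain key/value pairs. The excerpt below is illustrative only; apart from `model_type: eulerstack` and the dims used throughout this tutorial, the exact field names are assumptions, not EulerStack's real schema.

```json
{
  "model_type": "eulerstack",
  "d_model": 1024,
  "n_layers": 32,
  "vocab_size": 32000,
  "positional": "rope",
  "layer_schedule": ["mamba", "retnet", "mamba", "retnet", "mamba", "attn"]
}
```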
Inspecting a Model (explain)
Before generating weights, it is useful to see what the spec will draw. The
explain command reads the YAML and prints layer templates, schedule, and
parameter estimates as text. No GPU is required.
```shell
eulerstack explain --preset configs/presets/arch_expert_research.yml
```
Sample output:
```
Model: arch-expert-research
Family hint: full-hybrid-moe
Dims: d_model=1024, n_heads=8, n_kv_heads=4
Vocab: 32000, max seq: 32768, dtype: bfloat16
Positional: rope
Layer templates:
  attn:
    mixer: attention {...}
    ffn: gated_mlp
    ...
  mamba:
    mixer: mamba {variant: mamba2, d_state: 128, ...}
    ffn: gated_mlp
    ...
  retnet:
    mixer: retnet {...}
    ...
Layer schedule:
  mamba x1, retnet x1, mamba x1, retnet x1, mamba x1, attn x1, ...
Total layers: 32
Head: causal_lm, tie_weights=True
Compile target: huggingface
Estimated params: 1.48B (1,481,235,456)
Target params: 1.50B (ratio: 98.7%)
```
If the output does not match your expectations, edit the YAML. Running `explain`
before the expensive GPU compile step will catch many mistakes cheaply.
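As a quick sanity check, the printed parameter estimate can be reproduced approximately by hand from the dims in the sample output. The arithmetic below is a back-of-envelope illustration, not EulerStack's actual estimator:

```python
# Numbers taken from the sample `explain` output above.
d_model = 1024
vocab = 32000
n_layers = 32
total = 1_481_235_456   # "Estimated params"

# Embedding table: one d_model-sized vector per vocabulary entry.
# With tie_weights=True the LM head reuses this matrix, adding no parameters.
embedding = vocab * d_model
print(f"embedding: {embedding:,}")    # embedding: 32,768,000

# Remaining budget spread evenly across the 32 blocks.
per_layer = (total - embedding) // n_layers
print(f"~per layer: {per_layer:,}")   # ~per layer: 45,264,608
```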
Compiling to JSON (Inspection)
`compile --print-config` outputs the JSON version of the spec to the terminal;
`--output <file>` saves it to a file instead.
```shell
# Print JSON to terminal
eulerstack compile --preset configs/presets/arch_advanced_jamba.yml --print-config

# Save to file
eulerstack compile --preset configs/presets/arch_advanced_jamba.yml --output compiled.json
```
The JSON includes every field the HuggingFace config expects: `model_type`, the
stack pattern, block defaults, positional encoding, MoE routing, memory settings,
and so on. Running `diff` between two compiled JSONs is an effective way to
pinpoint exactly how two presets differ structurally.
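Because the compiled JSON is a plain nested dict, such comparisons are also easy to script. The sketch below is a minimal illustration; the `flat_diff` helper and the inline configs are hypothetical, not part of EulerStack:

```python
import json

def flat_diff(a, b, prefix=""):
    """Recursively compare two nested configs, yielding (dotted_key, old, new)."""
    for key in sorted(set(a) | set(b)):
        path = f"{prefix}{key}"
        va, vb = a.get(key), b.get(key)
        if isinstance(va, dict) and isinstance(vb, dict):
            yield from flat_diff(va, vb, path + ".")
        elif va != vb:
            yield path, va, vb

# In practice you would load two files produced by `compile --output`:
#   a = json.load(open("jamba.json")); b = json.load(open("research.json"))
a = {"d_model": 1024, "moe": {"num_experts": 8}}
b = {"d_model": 1024, "moe": {"num_experts": 16}}
for path, old, new in flat_diff(a, b):
    print(f"{path}: {old} -> {new}")   # moe.num_experts: 8 -> 16
```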
Compiling to HuggingFace Model (Export)
The primary artifact is the HF model directory, a folder that the
`transformers` library can load as a standard PyTorch model.
```shell
eulerstack compile --preset configs/presets/arch_advanced_jamba.yml --output-dir ./my_jamba_model
```
A successful export prints something like:
```
HF model saved: ./my_jamba_model
  model_type: eulerstack
  params: 1.20B (1,199,582,208)
  layers: 32
  load: AutoModelForCausalLM.from_pretrained('./my_jamba_model', trust_remote_code=True)
```
The directory layout is:
```
my_jamba_model/
├── config.json                  # EulerStackConfig (all architecture parameters)
├── model.safetensors            # Randomly-initialized weights
├── configuration_eulerstack.py  # Custom config loaded via trust_remote_code
└── modeling_eulerstack.py       # Model class loaded via trust_remote_code
```
The `configuration_eulerstack.py` and `modeling_eulerstack.py` files are saved
alongside so that `AutoModelForCausalLM.from_pretrained(trust_remote_code=True)`
can rebuild the model even on a machine that does not have eulerstack
installed; HF reads these files at load time.
Loading in Python
The exported directory loads through the standard HF interface.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from eulerstack.hf.auto_register import register_eulerstack_auto_classes

register_eulerstack_auto_classes()

# Same API as loading Llama / Mistral
model = AutoModelForCausalLM.from_pretrained(
    "./my_jamba_model",
    trust_remote_code=True,
    dtype="bfloat16",
)

# A standard PreTrainedModel
print(type(model))  # <class 'EulerStackForCausalLM'>
print(model.config.n_layers)

# Verify a forward pass runs
ids = torch.randint(0, 32000, (1, 128))
with torch.no_grad():
    out = model(ids)
print(out.logits.shape)  # (1, 128, 32000)
```
`register_eulerstack_auto_classes()` registers EulerStack's config and model
classes with HF's `AutoConfig` / `AutoModel` registry. This registration is
what allows `AutoModelForCausalLM` to resolve `model_type: eulerstack`
correctly.
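Conceptually, this registration is a two-step lookup: the `model_type` string in `config.json` selects a config class, and the config class selects a model class. The toy sketch below illustrates the idea only; it is a simplified stand-in, not transformers' actual implementation:

```python
# Toy registries standing in for HF's AutoConfig / AutoModel machinery.
CONFIG_REGISTRY = {}  # model_type string -> config class
MODEL_REGISTRY = {}   # config class -> model class

class EulerStackConfig:
    model_type = "eulerstack"

class EulerStackForCausalLM:
    def __init__(self, config):
        self.config = config

def register(model_type, config_cls, model_cls):
    CONFIG_REGISTRY[model_type] = config_cls
    MODEL_REGISTRY[config_cls] = model_cls

def from_pretrained(config_dict):
    # Resolution step: the `model_type` field picks both classes.
    config_cls = CONFIG_REGISTRY[config_dict["model_type"]]
    return MODEL_REGISTRY[config_cls](config_cls())

register("eulerstack", EulerStackConfig, EulerStackForCausalLM)
model = from_pretrained({"model_type": "eulerstack"})
print(type(model).__name__)   # EulerStackForCausalLM
```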
Full Pipeline Summary
All commands combined form this workflow:
```
YAML preset
  ↓ eulerstack validate --preset X --report    (catch errors early)
  ↓ eulerstack explain --preset X              (confirm structure)
  ↓ eulerstack compile --preset X --output-dir (export HF model)
HF model directory
  ↓ AutoModelForCausalLM.from_pretrained()     (load in Python)
  ↓ eulerforge train --preset ...              (train on real data)
Trained model
```
This is EulerStack's core workflow. Edit YAML, run `compile --output-dir`, and
the new architecture is in the HF ecosystem. From that point on, the standard
HF tooling (PEFT, TRL, eulerforge) applies without modification.
Validate-Only Mode
If you only want to confirm the spec compiles without actually building the HF
model, use `--validate-only`. This is fast and does not need a GPU.

```shell
eulerstack compile --preset my_model.yml --validate-only
```
Useful for CI pipelines that need to catch YAML preset regressions quickly.
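One hypothetical way to wire this into CI, sketched as a GitHub Actions-style job. The `pip install eulerstack` step and the preset paths are assumptions about your repository layout:

```yaml
# Hypothetical CI job: fail the build if any preset stops compiling.
validate-presets:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - run: pip install eulerstack   # assumed install method
    - run: |
        for preset in configs/presets/*.yml; do
          eulerstack compile --preset "$preset" --validate-only
        done
```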
Language Support
All of `compile`, `explain`, and `validate` honor the `--lang` flag.
```shell
eulerstack --lang en compile --preset my_model.yml --output-dir ./my_model
eulerstack --lang zh compile --preset my_model.yml --output-dir ./my_model
```
Success / error messages are translated. Command names and option names are not translated — this keeps scripting portable across locales.
Runnable Examples
To run the flow in this tutorial as a script rather than typing commands by
hand, look under `examples/`:

- `examples/01_compile_and_export.py`: end-to-end compile / `save_pretrained` flow
- `examples/02_load_and_generate.py`: load and run text generation
- `examples/03_architecture_evolution.py`: compare multiple presets at once
Next Steps
- Tutorial 5: Prepare Data — tokenized training data
- Tutorial 6: Sanity Train — short runs to confirm the model learns