
Getting Started

A quick-start guide for fine-tuning LLMs with EulerForge. For detailed explanations of each injection strategy, refer to the strategy-specific tutorials.


1. Installation and Environment

Prerequisites

Installation

# Clone the repository
git clone <repo_url>
cd eulerforge

# Install in editable mode
pip install -e .

# For development (includes tests)
pip install -e ".[dev]" pytest

Verify Installation

python -c "import eulerforge; print('OK')"
eulerforge train --help

2. Data Preparation

First, follow the Data Preprocessing Guide. It converts the raw data in data/ into EulerForge's standard raw JSONL format.

Provided Data and Purposes

| File | Format | Purpose |
|------|--------|---------|
| data/sft_10k_raw.jsonl | {prompt, response} | SFT training (01-04), PPO (08) |
| data/dpo_10k_raw.jsonl | {prompt, chosen, rejected} | DPO training (05), ORPO (06), RM (07) |
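Records can be sanity-checked against these two formats before training. A minimal stdlib sketch (`check_jsonl_line` is a hypothetical helper for illustration, not part of EulerForge):

```python
import json

# Required keys per task, matching the raw JSONL formats above.
# The {prompt, chosen, rejected} format is shared by DPO, ORPO, and RM.
REQUIRED_KEYS = {
    "sft": {"prompt", "response"},
    "dpo": {"prompt", "chosen", "rejected"},
}

def check_jsonl_line(line: str, task: str) -> bool:
    """Return True if one raw JSONL line has exactly the keys `task` expects."""
    record = json.loads(line)
    return set(record) == REQUIRED_KEYS[task]

# Example: check_jsonl_line('{"prompt": "Hi", "response": "Hello!"}', "sft") -> True
```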

Using Raw Data

Specify data.format=raw and the data will be automatically tokenized during training:

eulerforge train --preset configs/presets/<preset>.yml \
    --set data.format=raw \
    --set data.task=sft \
    --set data.path=data/sft_10k_raw.jsonl \
    --set data.max_length=512

Details: Data Preprocessing Guide


3. Core Concepts: Where / What / When

EulerForge organizes fine-tuning from three perspectives:

| Perspective | Question | Responsible Component |
|-------------|----------|-----------------------|
| Where | Where in the model to inject? | BackboneAdapter -- explores transformer blocks, FFN, attention |
| What | What to inject? | InjectionStrategy -- module transformations such as LoRA, MoE, experts |
| When | When to train which parameters? | PhaseScheduler -- controls trainable groups per phase |

Each strategy tutorial explains these three perspectives step by step.
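Concretely, the three perspectives map onto the top-level config sections. A minimal preset sketch for orientation; the keys `backbone`, `injection.lora_r`, `training.lr`, and `training.phases` appear elsewhere in this guide, while the exact shape of the phase entries (e.g., a `groups` field) is an assumption:

```yaml
# Where: which backbone/adapter to inject into
backbone: qwen3

# What: the injection strategy and its hyperparameters
injection:
  strategy: dense_lora
  lora_r: 32              # documented override key (injection.lora_r)

# When: which parameter groups train in each phase
training:
  lr: 2e-5
  phases:
    - groups: [lora, attn_lora]   # single phase for dense_lora (assumed schema)
```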


4. Backbone Adapters

The correct adapter is automatically selected based on the backbone configuration key:

| backbone Value | Adapter | Compatible Models |
|----------------|---------|-------------------|
| qwen3 | Qwen3Adapter | Qwen2, Qwen2.5, Qwen3 series |
| llama | LlamaAdapter | LLaMA 2/3, TinyLlama, Mistral (dense) |
| mixtral | MixtralAdapter | Mixtral (native MoE) |

5. Choosing an Injection Strategy

Strategy Comparison Table

| Strategy | Transform Target | Phases | MoE Section | Compatible Models | Suitable For |
|----------|------------------|--------|-------------|-------------------|--------------|
| dense_lora | Linear -> LoRALinear | 1 | Not needed | All | Simple fine-tuning, quick experiments |
| mixture_lora | Linear -> MixtureLoRALinear | 2 | Required | All | Multi-task, adaptive LoRA |
| moe_expert_lora | FFN -> MoEFFN + LoRA | 3 | Required | Dense only | Dense-to-MoE conversion |
| native_moe_expert_lora | LoRA on existing experts | 1 | Not needed | Mixtral only | Native MoE fine-tuning |

Strategy Tutorials

Training Method Tutorials

Model-Specific Tutorials

Training Method Selection Table

| Purpose | Recommended Method | Data | Reference Model |
|---------|--------------------|------|-----------------|
| Basic fine-tuning | SFT | instruction/response pairs | Not needed |
| Preference alignment (precise) | DPO | chosen/rejected pairs | Needed (adapter disable) |
| Preference alignment (efficient) | ORPO | chosen/rejected pairs | Not needed |
| Reward function learning | RM | chosen/rejected pairs | Not needed |
| RLHF (generate -> reward -> update) | PPO | Prompts + RM checkpoint | Needed (adapter disable) |

Which Strategy Should You Choose?

Is the model already an MoE architecture? (e.g., Mixtral)
  ├── Yes → native_moe_expert_lora
  └── No (Dense model)
        ├── Simple fine-tuning? → dense_lora
        ├── Multi-task/adaptive? → mixture_lora
        └── Convert to MoE structure? → moe_expert_lora

6. Phase Schedule Overview

The phase schedule controls when to train which parameter groups. Define it declaratively in training.phases.

Parameter Groups

| Group | Training Target |
|-------|-----------------|
| lora | LoRA parameters within FFN (lora_A, lora_B) |
| router | MoE router weights |
| base_ffn | Original FFN weights (gate_proj, up_proj, down_proj) |
| attn_lora | LoRA parameters within attention projections |

Phase Patterns by Strategy

| Strategy | Phase Configuration |
|----------|---------------------|
| dense_lora | 1 phase: [lora, attn_lora] |
| mixture_lora | 2 phases: [router] -> [lora, attn_lora] |
| moe_expert_lora | 3 phases: [router] -> [lora, attn_lora] -> [lora, attn_lora, router, base_ffn] |
| native_moe_expert_lora | 1 phase: [lora, attn_lora] |
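As a sketch, the three-phase moe_expert_lora pattern could be declared under training.phases like this. The group names and the phase order come from this guide; the `name`, `groups`, and `until_step` field names, and the step values, are assumptions for illustration:

```yaml
training:
  phases:
    - name: router_warmup          # phase 1: train only the MoE router
      groups: [router]
      until_step: 500
    - name: lora_main              # phase 2: train FFN + attention LoRA
      groups: [lora, attn_lora]
      until_step: 3000
    - name: joint_finetune         # phase 3: unfreeze everything listed
      groups: [lora, attn_lora, router, base_ffn]
```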

For detailed phase configuration and timelines, refer to the "When" section of each strategy tutorial.


7. CLI Quick Start

Basic Training (raw data)

eulerforge train --preset configs/presets/<preset>.yml \
    --set data.format=raw \
    --set data.task=sft \
    --set data.path=data/sft_10k_raw.jsonl \
    --set data.max_length=512

Available Presets

| Preset File | Strategy | Training Type |
|-------------|----------|---------------|
| qwen3.5_0.8b_dense_lora_sft.yml | dense_lora | SFT |
| qwen3.5_0.8b_mixture_lora_sft.yml | mixture_lora | SFT |
| qwen3.5_0.8b_moe_expert_lora_sft.yml | moe_expert_lora | SFT |
| qwen3.5_0.8b_moe_expert_lora_dpo.yml | moe_expert_lora | DPO |
| qwen3.5_0.8b_dense_lora_orpo.yml | dense_lora | ORPO |
| qwen3.5_0.8b_dense_lora_rm.yml | dense_lora | RM |
| qwen3.5_0.8b_dense_lora_ppo.yml | dense_lora | PPO |
| llama3_1b_dense_lora_sft.yml | dense_lora | SFT |
| tinyllama_1.1b_dense_lora_dpo.yml | dense_lora | DPO |
| mixtral_native_expert_lora_sft.yml | native_moe_expert_lora | SFT |

Configuration Overrides

Override any configuration value using dot-path notation:

eulerforge train --preset configs/presets/qwen3.5_0.8b_dense_lora_sft.yml \
    --set data.format=raw \
    --set data.task=sft \
    --set data.path=data/sft_10k_raw.jsonl \
    --set data.max_length=512 \
    --set training.lr=2e-5 \
    --set injection.lora_r=32

Useful CLI Options

| Option | Description |
|--------|-------------|
| --validate-only | Validate the config file only; no model loading |
| --preflight | Load model + apply injection + verify parameters; no training |
| --debug | Enable debug mode |
| --debug-trainable-names | Print trainable parameter names |
| --debug-every N | Print debug info every N steps |

For a full CLI reference, see the CLI Documentation.


8. Running Benchmarks

eulerforge bench --target-dir /path/to/checkpoint \
    --ref-model Qwen/Qwen3.5-0.8B-Base \
    --test-data /path/to/test.jsonl \
    --output-file results.jsonl

9. Common Troubleshooting

| Symptom | Cause | Solution |
|---------|-------|----------|
| Missing required top-level section | backbone/injection/training missing in YAML | Add the missing section |
| Unknown strategy 'xxx' | Typo in the strategy name | Check the supported strategy names |
| No trainable parameters | Target keywords don't match | Inspect parameter names with --debug-trainable-names |
| OOM (out of memory) | Insufficient VRAM | model.load_precision.mode: int4 (4-bit QLoRA), reduce batch_size, reduce lora_r |
| Phase transition not occurring | Phase steps fall outside the max_train_steps range | Check the phase step values |
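For the OOM case, the listed mitigations can be combined in one config. Only `model.load_precision.mode: int4` and `injection.lora_r` appear verbatim in this guide; the `training.batch_size` key path is an assumption inferred from the other dot-path overrides:

```yaml
model:
  load_precision:
    mode: int4      # 4-bit QLoRA loading (documented)
training:
  batch_size: 1     # assumed key path; reduce the batch size
injection:
  lora_r: 8         # smaller LoRA rank (injection.lora_r is documented)
```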

For strategy-specific troubleshooting, refer to the "Debugging and Troubleshooting" section of each tutorial.


10. Scratch Pretraining (pretrain)

To train a newly assembled model from scratch (e.g., built with EulerStack) rather than an existing HF model, use the eulerforge pretrain command.

eulerforge pretrain --preset configs/presets/pretrain/eulerstack_hybrid_moe.yml

pretrain is a completely separate pipeline from train, performing full-parameter causal LM training. It does not use LoRA injection or phase scheduling.

Details: 17_pretrain.md


11. Training Pipeline -- From SFT to PPO

EulerForge supports five training types (SFT, DPO, ORPO, RM, PPO). Always start with SFT:

SFT → DPO (or ORPO) → Deploy       # Most common (2 stages)
SFT → RM → PPO → Deploy             # Full RLHF (3 stages)
SFT → ORPO → RM → PPO → Deploy      # ORPO-based full RLHF

Warning: Applying DPO/ORPO/RM/PPO directly to a base model will degrade performance. Always provide instruction-following ability via SFT first, then proceed with preference learning.

The checkpoint from each stage becomes the model input for the next stage.

Details: 18_training_pipeline.md