Pattern 04. Judge Node and Quality Loop — Iterate Until Satisfied
Learning Objectives
After completing this tutorial, you will be able to:
- Declare a Judge node and correctly configure the evaluator_v1 schema
- Understand the relationship between route_values and edges, and prevent JUDGE_ROUTE_COVERAGE_ERROR
- Design an evaluate → revise → evaluate loop pattern and properly bound cycles
- Explain the relationship between max_iterations and the UNBOUNDED_CYCLE error
- Read and interpret Judge results from pattern_events.jsonl
Prerequisites
- 03_simple_linear.md completed (linear pattern writing experience)
- my_first_pattern.yaml exists (created in tutorial 03)
ls my_first_pattern.yaml
euleragent pattern validate my_first_pattern.yaml
1. Why Do We Need a Judge Node?
In the linear pattern from 03_simple_linear.md, the write node executes only once. Even if the output quality is low, the pattern simply terminates. In practice, the flow should look like this:
Write draft → Quality evaluation
                │
                ├── Good enough → Complete
                │
                └── Insufficient → Revise → Re-evaluate → ...
The Judge node handles this "evaluate → branch" role. It asks the LLM for an evaluation and routes to different nodes based on the result.
2. Understanding the Judge Node
evaluator_v1 Schema
judge.schema: evaluator_v1 is a built-in evaluation schema. Using this schema requests the following structured JSON response from the Judge LLM:
{
"score": 0.87,
"route": "finalize",
"reason": "Core concepts clearly explained. Code example quality excellent.",
"suggestions": [
"A stronger opening would be beneficial",
"Adding a call-to-action (CTA) to the conclusion would improve completeness"
]
}
- score: A value from 0.0 to 1.0. Used for comparison with pass_threshold (runtime reference value)
- route: One of route_values. Used for actual routing decisions
- reason: Evaluation rationale (recorded in logs)
- suggestions: Automatically passed to the next revise node
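The field rules above can be checked mechanically. The following is a minimal Python sketch, not euleragent's own validator: the function name parse_judge_result is hypothetical, and it assumes only what the list above states (score between 0.0 and 1.0, route drawn from route_values, reason as text, suggestions as a list of strings).

```python
import json


def parse_judge_result(raw, route_values):
    """Parse a Judge response and check it against the field rules above.

    Hypothetical helper for illustration; not part of euleragent.
    """
    result = json.loads(raw)

    score = result["score"]
    is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
    if not is_number or not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be a number in [0.0, 1.0], got {score!r}")

    if result["route"] not in route_values:
        raise ValueError(f"route {result['route']!r} is not one of {route_values}")

    if not isinstance(result["reason"], str):
        raise ValueError("reason must be a string")

    suggestions = result["suggestions"]
    if not isinstance(suggestions, list) or not all(
        isinstance(item, str) for item in suggestions
    ):
        raise ValueError("suggestions must be a list of strings")

    return result


raw = (
    '{"score": 0.87, "route": "finalize", '
    '"reason": "Core concepts clearly explained.", '
    '"suggestions": ["A stronger opening would be beneficial"]}'
)
result = parse_judge_result(raw, ["finalize", "revise"])
print(result["route"], result["score"])
```

A response whose route is outside route_values, or whose score falls outside the range, raises ValueError in this sketch.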
Declaring route_values
A Judge node must declare possible routing values in route_values. Every value must have a corresponding edge.
nodes:
  - id: evaluate
    kind: judge
    judge:
      schema: evaluator_v1
      route_values: [finalize, revise]   # Edges required for both
edges:
  - from: evaluate
    to: finalize
    when: "judge.route == finalize"   # Covers finalize
  - from: evaluate
    to: revise
    when: "judge.route == revise"     # Covers revise
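The coverage rule (every value in route_values needs an outgoing edge) can be illustrated with a short Python sketch. This is an assumption about how such a check could work, not the validator's actual code. The function name uncovered_routes and the dict-based edge representation are hypothetical.

```python
import re

# Matches conditions of the form: judge.route == <value>
ROUTE_CONDITION = re.compile(r"^\s*judge\.route\s*==\s*(\w+)\s*$")


def uncovered_routes(node_id, route_values, edges):
    """Return the route values of a judge node that no outgoing edge covers."""
    covered = set()
    for edge in edges:
        if edge["from"] != node_id:
            continue
        match = ROUTE_CONDITION.match(edge["when"])
        if match:
            covered.add(match.group(1))
    return [route for route in route_values if route not in covered]


edges = [
    {"from": "evaluate", "to": "finalize", "when": "judge.route == finalize"},
    {"from": "evaluate", "to": "revise", "when": "judge.route == revise"},
]

# Both routes have an edge, so nothing is missing.
print(uncovered_routes("evaluate", ["finalize", "revise"], edges))

# Dropping the revise edge leaves 'revise' uncovered, which is the situation
# JUDGE_ROUTE_COVERAGE_ERROR is meant to catch.
print(uncovered_routes("evaluate", ["finalize", "revise"], edges[:1]))
```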
when Condition Syntax
The when DSL used for Judge routing:
# Route value comparison
when: "judge.route == finalize"
when: "judge.route == revise"
# Score threshold (optional)
when: "judge.score >= 0.85"
when: "judge.score < 0.7"
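To make the semantics of these conditions concrete, here is a small Python sketch of an evaluator for the forms shown above. It is illustrative only: the real DSL grammar is not documented here, so the function evaluate_when is hypothetical and supports just the literal true, judge.route with ==, and judge.score with >= or <.

```python
import operator
import re

# Only the operators that appear in the examples above are supported.
OPERATORS = {"==": operator.eq, ">=": operator.ge, "<": operator.lt}
CONDITION = re.compile(r"^\s*judge\.(route|score)\s*(==|>=|<)\s*(\S+)\s*$")


def evaluate_when(expr, judge):
    """Evaluate a when expression against a Judge result dict."""
    if expr.strip() == "true":
        return True
    match = CONDITION.match(expr)
    if match is None:
        raise ValueError(f"unsupported when expression: {expr!r}")
    field, op, literal = match.groups()
    if field == "route":
        if op != "==":
            raise ValueError("judge.route only supports == in this sketch")
        return judge["route"] == literal
    # Score comparisons treat the right-hand side as a number.
    return OPERATORS[op](judge["score"], float(literal))


judge = {"score": 0.71, "route": "revise"}
print(evaluate_when("judge.route == revise", judge))
print(evaluate_when("judge.score >= 0.85", judge))
print(evaluate_when("judge.score < 0.7", judge))
```

With a score of 0.71 and route revise, the first condition holds and the two score conditions do not.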
3. Pattern Design
We add a Judge loop to the blog writing pattern created earlier.
[research] → [draft] → [evaluate] → finalize
                         ▲    │
                         │    └── judge.route == revise
                         │                │
                         │                ▼
                         └─────────── [revise]
Complete flow:
┌─────────────────────────────────────────────────────────────────┐
│  blog_with_judge.pattern Flow Diagram                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  [research]                                                     │
│      │  Topic investigation (llm/execute, exclude: web.search)  │
│      │  when: true                                              │
│      ▼                                                          │
│  [draft]                                                        │
│      │  Write draft (llm/execute)                               │
│      │  when: true                                              │
│      ▼                                                          │
│  [evaluate] ──────── when: judge.route == finalize ──────────┐  │
│      │  Quality evaluation (judge/evaluator_v1)              │  │
│      │  when: judge.route == revise                          │  │
│      ▼                                                       │  │
│  [revise]                                                    │  │
│      │  Improve draft (llm/execute)                          │  │
│      │  when: true                                           │  │
│      └──────────► [evaluate] (loop max 3 times)              │  │
│                                                              │  │
│  ┌────────────────────────────────────────────────────────┐  │  │
│  │  [FINALIZE] Save blog_post.md                          │◄─┘  │
│  └────────────────────────────────────────────────────────┘     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
4. Writing the YAML
Create the blog_with_judge.yaml file.
id: blog.quality_loop
version: 1
category: writing
description: "Blog writing pattern with quality loop using Judge node"
defaults:
  # max_iterations is required since there is a cycle!
  # The evaluate → revise → evaluate loop repeats up to 3 times
  max_iterations: 3
  # Maximum number of tool calls across the entire execution
  max_total_tool_calls: 15
  # Judge prefers finalize routing when score is at or above this threshold
  # (Runtime reference value - influences Judge LLM's decision)
  pass_threshold: 0.85
nodes:
  # ── Node 1: research ──
  - id: research
    kind: llm
    runner:
      mode: execute
      exclude_tools: [web.search, web.fetch]
    prompt:
      system_append: |
        You are a technical researcher. Organize key points about the topic.
        Output: Research notes in markdown structure (500-800 words)
    artifacts:
      primary: research_notes.md
  # ── Node 2: draft ──
  - id: draft
    kind: llm
    runner:
      mode: execute
      exclude_tools: [web.search, web.fetch, shell.exec]
    prompt:
      system_append: |
        You are a technical blog writer.
        Write a complete blog post draft based on the research notes.
        Requirements:
        - Length: 800-1200 words
        - Structure: Introduction → Body (3 sections) → Conclusion
        - Audience: Experienced developers
        - Include code examples
        - Markdown format
    artifacts:
      primary: blog_post.md
  # ── Node 3: evaluate (Judge) ──
  - id: evaluate
    kind: judge                  # Declared as judge type
    judge:
      schema: evaluator_v1       # Built-in evaluation schema
      # List of possible routing values.
      # All must be covered by edges below!
      route_values: [finalize, revise]
    prompt:
      system_append: |
        You are a technical blog editor-in-chief. Evaluate the blog post using the following criteria.
        Evaluation Criteria:
        - Technical accuracy (30%): Is the information accurate and up-to-date?
        - Structure and readability (25%): Are the logical flow and section divisions clear?
        - Code quality (25%): Are code examples executable and clear?
        - Reader value (20%): Can readers learn something new?
        Choose 'finalize' if score >= 0.85, otherwise choose 'revise'.
        Write suggestions that are specific and actionable.
  # ── Node 4: revise ──
  - id: revise
    kind: llm
    runner:
      mode: execute
      exclude_tools: [web.search, web.fetch, shell.exec]
    prompt:
      system_append: |
        You are a technical blog writer.
        Improve the blog post by incorporating the editor's feedback.
        Important: Incorporate all of the editor's suggestions,
        but maintain the overall structure and technical content of the post.
        Rewrite the entire improved post.
    artifacts:
      primary: blog_post.md      # Same filename as draft, so the draft is overwritten
edges:
  # Linear flow
  - from: research
    to: draft
    when: "true"
  - from: draft
    to: evaluate
    when: "true"
  # Judge routing: must cover all values in route_values
  - from: evaluate
    to: finalize
    when: "judge.route == finalize"   # finalize route
  - from: evaluate
    to: revise
    when: "judge.route == revise"     # revise route
  # Re-evaluate after revision (loop)
  - from: revise
    to: evaluate
    when: "true"
finalize:
  artifact: blog_post.md
5. Validation
euleragent pattern validate blog_with_judge.yaml
Expected output:
Validating pattern: blog_with_judge.yaml
Stage 1 (Schema) PASS
Stage 2 (Structural) PASS
Stage 3 (IR Analysis) PASS
Cycle detected: evaluate → revise → evaluate
Bounded by: max_iterations = 3 ✓
Validation complete: 0 errors, 0 warnings
6. Compilation
euleragent pattern compile blog_with_judge.yaml
Check the cycle information in the compilation output:
{
"id": "blog.quality_loop",
"entry_node": "research",
"cycles": [
{
"path": ["evaluate", "revise", "evaluate"],
"length": 2,
"bounded_by": "max_iterations",
"max_iterations": 3
}
],
"nodes": {
"evaluate": {
"kind": "judge",
"judge": {
"schema": "evaluator_v1",
"route_values": ["finalize", "revise"],
"route_coverage": {
"finalize": { "covered": true, "edge": "evaluate→finalize" },
"revise": { "covered": true, "edge": "evaluate→revise" }
}
}
}
}
}
Verify from route_coverage that all route_values are covered.
7. Execution and Checking Judge Results
Installation and Execution
cp blog_with_judge.yaml .euleragent/patterns/
euleragent pattern run blog.quality_loop my-agent \
--task "Understanding Docker Container Networking — Comparing bridge, host, and overlay modes" \
--project default
Expected output (Judge decides to revise):
[run:g7b3e1d4] Starting pattern: blog.quality_loop
✓ research Completed (11s)
✓ draft Completed (16s) — 1,023 words
✓ evaluate Completed (8s)
score: 0.71 → route: revise
reason: "Insufficient code examples and shallow overlay network explanation"
suggestions:
- "Add docker network create command examples"
- "Add real-world overlay network use cases (Docker Swarm)"
✓ revise Completed (19s) — 1,187 words (improved)
✓ evaluate Completed (7s)
score: 0.89 → route: finalize
reason: "Thorough code examples, clear structure, high reader value"
✓ finalize Completed
Run g7b3e1d4 completed. (2 evaluate iterations)
Artifact: .euleragent/runs/g7b3e1d4/artifacts/blog_post.md
Checking Judge Results in the Event Stream
grep '"node":"evaluate"' .euleragent/runs/g7b3e1d4/pattern_events.jsonl
Output:
{"ts":"2026-02-23T14:32:18Z","event":"node.complete","node":"evaluate","kind":"judge","result":{"score":0.71,"route":"revise","reason":"Insufficient code examples and shallow overlay network explanation","suggestions":["Add docker network create command examples","Add real-world overlay network use cases (Docker Swarm)"]},"iteration":1}
{"ts":"2026-02-23T14:32:52Z","event":"node.complete","node":"evaluate","kind":"judge","result":{"score":0.89,"route":"finalize","reason":"Thorough code examples, clear structure, high reader value","suggestions":[]},"iteration":2}
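For anything beyond a quick grep, the event stream can be parsed line by line as JSON. The sketch below uses only the field names visible in the output above (event, kind, result, iteration). The helper name judge_results is hypothetical, and the sample lines shorten reason and suggestions for brevity.

```python
import json


def judge_results(lines):
    """Collect (iteration, score, route) from judge node.complete events."""
    results = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the stream
        event = json.loads(line)
        if event.get("event") == "node.complete" and event.get("kind") == "judge":
            result = event["result"]
            results.append((event["iteration"], result["score"], result["route"]))
    return results


# Two events shaped like the output above (reason and suggestions shortened).
sample = [
    '{"ts":"2026-02-23T14:32:18Z","event":"node.complete","node":"evaluate",'
    '"kind":"judge","result":{"score":0.71,"route":"revise","reason":"...",'
    '"suggestions":["..."]},"iteration":1}',
    '{"ts":"2026-02-23T14:32:52Z","event":"node.complete","node":"evaluate",'
    '"kind":"judge","result":{"score":0.89,"route":"finalize","reason":"...",'
    '"suggestions":[]},"iteration":2}',
]

for iteration, score, route in judge_results(sample):
    print(f"iteration {iteration}: score={score:.2f} route={route}")
```

For a real run, the same function accepts an open file handle, for example the result of open() on the run's pattern_events.jsonl, because iterating a text file yields one line at a time.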
8. Demonstrating the Error When max_iterations Is Removed
Let us intentionally trigger the error. Remove or comment out defaults.max_iterations in blog_with_judge.yaml.
defaults:
  # max_iterations: 3   ← Comment out or remove this line
Run validation:
euleragent pattern validate blog_with_judge.yaml
Expected output:
Validating pattern: blog_with_judge.yaml
Stage 1 (Schema) PASS
Stage 2 (Structural) PASS
Stage 3 (IR Analysis) FAIL
ERROR [UNBOUNDED_CYCLE]
Cycle detected: evaluate → revise → evaluate
This cycle has no bound. Set defaults.max_iterations to limit iterations.
Hint: Add to defaults section:
max_iterations: 3
Validation complete: 1 error, 0 warnings
Restoring max_iterations will resolve the error.
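The idea behind the UNBOUNDED_CYCLE check can be sketched as a depth-first search over the edge list: find a cycle, then report it only if no iteration bound is set. This is a simplified illustration under stated assumptions, not the validator's actual implementation. The names find_cycle and check_cycles are hypothetical, and edges are reduced to (from, to) pairs.

```python
def find_cycle(edges):
    """Return one cycle as a list of node ids (first == last), or None."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for nxt in graph[node]:
            if color[nxt] == GRAY:
                # Back edge: the cycle is the path segment starting at nxt.
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


def check_cycles(edges, max_iterations):
    """Return an error string when a cycle exists without a bound."""
    cycle = find_cycle(edges)
    if cycle and max_iterations is None:
        return "UNBOUNDED_CYCLE: " + " → ".join(cycle)
    return None


edges = [
    ("research", "draft"),
    ("draft", "evaluate"),
    ("evaluate", "finalize"),
    ("evaluate", "revise"),
    ("revise", "evaluate"),
]
print(check_cycles(edges, max_iterations=None))
print(check_cycles(edges, max_iterations=3))
```

With no bound the sketch reports the evaluate → revise → evaluate cycle. With max_iterations set to 3 it reports nothing.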
9. The Role of pass_threshold
defaults.pass_threshold: 0.85 is a runtime hint passed to the Judge LLM. It is automatically injected into the Judge's system_append.
Evaluation guidelines the Judge LLM receives:
pass_threshold: 0.85
→ "Choose 'finalize' if score >= 0.85, otherwise choose 'revise'"
pass_threshold does not force the Judge's routing. The Judge LLM makes the final decision. This value serves to communicate the expected quality level to the Judge.
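Because the threshold is a hint rather than a rule, a Judge result can disagree with it. A small Python sketch of how such disagreements could be spotted when reviewing results follows. The function threshold_disagrees is hypothetical and assumes the two-route setup (finalize, revise) used in this tutorial.

```python
def threshold_disagrees(score, route, pass_threshold=0.85):
    """Return True when the Judge's route differs from what the hint suggests."""
    suggested = "finalize" if score >= pass_threshold else "revise"
    return route != suggested


# The Judge followed the hint in the first two cases and overrode it in the third.
print(threshold_disagrees(0.71, "revise"))
print(threshold_disagrees(0.89, "finalize"))
print(threshold_disagrees(0.86, "revise"))
```

A disagreement is not an error under the behavior described above. It only shows that the Judge LLM weighed something other than the raw score.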
10. Key Concept Explanations
Difference Between Cycles and Linear Flows
Linear pattern (no cycles):
[A] → [B] → [C] → finalize
max_iterations not required. Each node executes exactly once.
Quality loop pattern (with cycles):
[A] → [B] → [C] → finalize
            │ ▲
            ▼ │
            [D]     (when C selects revise)
max_iterations required. Without a bound, the C-D-C loop could repeat indefinitely.
Can We Use an llm Node Instead of a Judge for Evaluation?
Technically, you can branch from an llm node using when: "true" conditions. However, the judge node has the following advantages:
- Structured responses (score, route, suggestions) are guaranteed via the evaluator_v1 schema
- Compile-time coverage verification is possible through the route_values declaration
- suggestions are automatically passed to the next revise node
- Evaluation results are recorded in a structured format in the event stream
How max_iterations Works
max_iterations: 3 limits the number of loops within a cycle. If the Judge still selects revise after 3 iterations, the runtime forcibly routes to finalize.
Iteration 1: evaluate(score:0.71, revise) → revise
Iteration 2: evaluate(score:0.79, revise) → revise
Iteration 3: evaluate(score:0.83, revise) → ⚠️ max_iterations reached → forced finalize
At this point, a warning is recorded in the event stream.
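The forced-finalize behavior can be replayed with a few lines of Python. This is a sketch of the logic as described above, not the runtime's code. The function replay_quality_loop is hypothetical, and it assumes the iteration count refers to evaluate executions, as in the trace above.

```python
def replay_quality_loop(judge_decisions, max_iterations):
    """Replay (score, route) decisions and apply the iteration bound.

    Returns a list of (iteration, score, route_taken, forced) tuples.
    """
    trace = []
    for iteration, (score, route) in enumerate(judge_decisions, start=1):
        if route == "finalize":
            trace.append((iteration, score, "finalize", False))
            break
        if iteration >= max_iterations:
            # The Judge still wants another revision, but the bound is reached.
            trace.append((iteration, score, "finalize", True))
            break
        trace.append((iteration, score, route, False))
    return trace


decisions = [(0.71, "revise"), (0.79, "revise"), (0.83, "revise")]
for iteration, score, taken, forced in replay_quality_loop(decisions, max_iterations=3):
    note = " (forced: max_iterations reached)" if forced else ""
    print(f"Iteration {iteration}: score {score:.2f} -> {taken}{note}")
```

With the three revise decisions from the example, the third iteration ends in a forced finalize. A run where the Judge picks finalize earlier stops there without the forced flag.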
11. Practice Exercise: More Granular Routing
The current pattern has only two paths: finalize or revise. Extend it to apply different improvement intensities based on the score.
Exercise: Score-Based 3-Level Routing
# Modify the evaluate node
judge:
  schema: evaluator_v1
  route_values: [finalize, light_edit, major_rewrite]
# Modify system_append
system_append: |
  Evaluation criteria:
  - score >= 0.85: 'finalize'
  - score 0.65-0.84: 'light_edit' (minor corrections)
  - score < 0.65: 'major_rewrite' (complete rewrite)
# Add new nodes
- id: light_edit
  kind: llm
  runner:
    mode: execute
  prompt:
    system_append: |
      Incorporate only the top 2 suggestions from the editor
      and improve the blog post with minimal modifications.
- id: major_rewrite
  kind: llm
  runner:
    mode: execute
  prompt:
    system_append: |
      Completely rewrite the blog post.
      Re-read research_notes.md and write with an entirely new approach.
# Add new edges
- from: evaluate
  to: finalize
  when: "judge.route == finalize"
- from: evaluate
  to: light_edit
  when: "judge.route == light_edit"
- from: evaluate
  to: major_rewrite
  when: "judge.route == major_rewrite"
- from: light_edit
  to: evaluate
  when: "true"
- from: major_rewrite
  to: evaluate
  when: "true"
Save the modified pattern as blog_three_routes.yaml, then validate it:
euleragent pattern validate blog_three_routes.yaml
Verify that JUDGE_ROUTE_COVERAGE_ERROR does not occur.
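The score bands given to the Judge in this exercise correspond to the following mapping, written as a Python sketch for reference. The function route_for_score is hypothetical. In the pattern itself the Judge LLM chooses the route, and this only shows which route the stated criteria imply for a given score.

```python
def route_for_score(score):
    """Map a score to the route implied by the exercise's criteria."""
    if score >= 0.85:
        return "finalize"
    if score >= 0.65:
        return "light_edit"      # minor corrections
    return "major_rewrite"       # complete rewrite


for score in (0.90, 0.85, 0.84, 0.65, 0.64):
    print(score, route_for_score(score))
```

The criteria list the middle band as 0.65-0.84. The sketch treats every score from 0.65 up to, but not including, 0.85 as light_edit, so a value such as 0.845 does not fall through a gap.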
Next Steps
You now understand Judge loops. Next, learn how to safely integrate real web search into patterns.
- Next tutorial: 05_web_research.md. Use web search under HITL approval with force_tool: web.search
- Human review: 06_human_gate.md. Create a gate where humans evaluate directly instead of a Judge
- 3-way routing: 07_multi_route.md. Explore the above practice exercise in greater depth