Evaluator-Optimizer Pattern
A writer and an editor working in a loop
When you write something important, you draft it, and then a friend reviews it and says "this part is unclear, fix it." You revise, they check again, and you repeat until it's good. Evaluator-Optimizer is the same teamwork between two LLMs: one is the writer (optimizer) who produces the work, and the other is the editor (evaluator) who judges it and gives feedback. The two loop until the result passes or a round limit is reached.
Key points
- One LLM generates; another LLM evaluates against clear criteria.
- The evaluator gives specific feedback, not just pass/fail.
- The loop repeats: generate → evaluate → improve, until good enough.
The one-line definition
Evaluator-Optimizer is a workflow where one LLM (the optimizer/generator) produces an output, a second LLM (the evaluator) judges it against criteria and returns feedback, and the two loop — improving the output each round — until it meets the bar or a limit is hit.
Note: Writer + editor in a loop. Generate, critique, improve, repeat.
The generate-evaluate-improve loop
      TASK (e.g. 'translate this poem')
               │
               ▼
      ┌──────────────────┐
      │  OPTIMIZER ✍️    │ ◄────────────┐
      │  generate /      │              │
      │  improve draft   │              │
      └────────┬─────────┘              │
               │ draft                  │
               ▼                        │ feedback
      ┌──────────────────┐              │ ('fix X, Y')
      │  EVALUATOR 🧐    │              │
      │  good enough?    │──────────────┘
      │  check criteria  │   NOT yet
      └────────┬─────────┘
               │ PASS ✅
               ▼
        ✅ FINAL OUTPUT
Why feedback (not just a score) matters
PLAIN SCORE ONLY              SPECIFIC FEEDBACK
────────────────              ─────────────────
Evaluator: "6/10"             Evaluator: "Tone is
        │                     too formal; line 3
        ▼                     lost the rhyme."
Optimizer guesses                     │
blindly what to fix 🤷                ▼
        │                     Optimizer fixes the
        ▼                     EXACT problems 🎯
slow, random progress         fast, targeted fixes
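One way to make sure the evaluator's reply carries both a verdict and written feedback is to parse it into a small structure. The sketch below is illustrative only: it assumes the evaluator was told to put `PASS` or `FAIL` on the first line and its comments after that, and the names `Review` and `parse_review` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Review:
    passed: bool   # the verdict
    feedback: str  # the evaluator's written comments


def parse_review(raw: str) -> Review:
    """Split an evaluator reply into a verdict and feedback.

    Assumption: the first line is PASS or FAIL and everything after
    it is free-form feedback.
    """
    first_line, _, rest = raw.strip().partition("\n")
    passed = first_line.strip().upper().startswith("PASS")
    return Review(passed=passed, feedback=rest.strip())
```

With a reply such as `"FAIL\nTone is too formal; line 3 lost the rhyme."`, the optimizer receives the comments as `feedback` instead of a bare score.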
A tiny code example (read it like English)
Notice the loop has a max-rounds cap so it can never spin forever. Each round the evaluator returns whether it passed plus written feedback, and that feedback is fed back into the optimizer.
def refine(task, max_rounds=3):
    # `llm` is a placeholder for whatever function sends a prompt to your model
    draft = llm(f"Do this task: {task}")
    for _ in range(max_rounds):
        # EVALUATOR judges the draft against the task and gives feedback
        review = llm(
            f"Task: {task}\n"
            f"Start your reply with PASS or FAIL, then explain why:\n{draft}"
        )
        if review.strip().upper().startswith("PASS"):
            return draft
        feedback = review
        # OPTIMIZER improves using the feedback
        draft = llm(
            f"Improve this using the feedback:\n"
            f"DRAFT:\n{draft}\nFEEDBACK:\n{feedback}"
        )
    return draft  # best effort after the cap
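To try the loop without calling a real model, one option is to pass the model call in as a parameter and substitute a scripted stand-in. The sketch below is a variant of the loop above under that assumption. `fake_llm` is hypothetical and only imitates an evaluator that wants an exclamation mark. The `EVALUATE` and `IMPROVE` prompt prefixes are also invented for this demo.

```python
from typing import Callable


def refine(task: str, llm: Callable[[str], str], max_rounds: int = 3) -> str:
    draft = llm(f"Do this task: {task}")
    for _ in range(max_rounds):
        # Evaluator: verdict first, then the explanation
        review = llm(
            f"EVALUATE\nTask: {task}\n"
            f"Reply PASS or FAIL first, then explain.\nDRAFT:\n{draft}"
        )
        if review.strip().upper().startswith("PASS"):
            return draft
        # Optimizer: revise using the evaluator's feedback
        draft = llm(f"IMPROVE\nDRAFT:\n{draft}\nFEEDBACK:\n{review}")
    return draft  # best effort after the cap


def fake_llm(prompt: str) -> str:
    """Scripted stand-in for a model (not a real API)."""
    if prompt.startswith("EVALUATE"):
        draft = prompt.split("DRAFT:\n", 1)[1]
        return "PASS" if "!" in draft else "FAIL: add an exclamation mark."
    if prompt.startswith("IMPROVE"):
        draft = prompt.split("DRAFT:\n", 1)[1].split("\nFEEDBACK:", 1)[0]
        return draft + "!"
    return "hello"


result = refine("greet the user", fake_llm)
```

Here the first draft fails, the feedback drives one revision, and the second evaluation passes, so the loop stops before reaching the cap.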
When should you use evaluator-optimizer?
| Scenario | Recommendation | Why |
|---|---|---|
| You have clear criteria for what 'good' looks like | ✅ Use it | The evaluator can judge against those criteria and guide fixes. |
| Quality clearly improves when feedback is applied (e.g., writing, translation, code) | ✅ Use it | Iterating with critique noticeably raises quality. |
| There's no objective way to tell if the output is 'good' | ❌ Skip it | Without criteria, the evaluator can't give useful feedback. |
| A single call already meets the bar | ❌ Single call | Extra evaluation rounds add cost for no gain. |
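To put a number on the cost in the last row, the sketch below counts model calls for a loop shaped like the `refine` example earlier: one call for the first draft, then one evaluation and one revision per failed round. The function name `calls_used` is hypothetical, and the count assumes every call costs about the same.

```python
def calls_used(failed_rounds: int, passed: bool) -> int:
    """Count model calls for one run of the generate-evaluate-improve loop."""
    calls = 1                   # initial draft
    calls += 2 * failed_rounds  # each failed round: one evaluation + one revision
    if passed:
        calls += 1              # the final evaluation that returned PASS
    return calls
```

Under these assumptions, a draft that passes its first check costs 2 calls instead of 1, and hitting a cap of 3 rounds without passing costs 7.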
Evaluator-Optimizer mistakes beginners make
| Mistake | Consequence | Fix |
|---|---|---|
| No maximum number of rounds. | If it never passes, the loop runs forever and burns money. | Always cap rounds (e.g., 3) and return the best effort when the cap is hit. |
| The evaluator gives only a score, not actionable feedback. | The optimizer guesses blindly and barely improves. | Make the evaluator explain exactly what's wrong and how to fix it. |
| Vague or missing evaluation criteria. | The evaluator is inconsistent and the loop drifts aimlessly. | Give the evaluator a clear, written checklist of what 'good' means. |
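The last two fixes can be combined in the evaluator's prompt: hand it a written checklist and ask for a verdict plus concrete fixes. The sketch below is one possible wording, not a prescribed template. The function name `build_evaluator_prompt` and the exact prompt text are assumptions.

```python
from typing import Sequence


def build_evaluator_prompt(task: str, draft: str, criteria: Sequence[str]) -> str:
    """Build an evaluator prompt that lists explicit, numbered criteria."""
    checklist = "\n".join(f"{i}. {c}" for i, c in enumerate(criteria, start=1))
    return (
        f"You are reviewing a draft for this task: {task}\n"
        f"Check it against every item in this checklist:\n{checklist}\n"
        "Reply with PASS or FAIL on the first line. If FAIL, name each "
        "failed item and say how to fix it.\n"
        f"DRAFT:\n{draft}"
    )


prompt = build_evaluator_prompt(
    "translate this poem",
    "Roses are red...",
    ["Keeps the rhyme scheme", "Uses an informal tone"],
)
```

Numbering the criteria lets the evaluator refer to a failed item by number, which keeps its feedback specific.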
Remember these lines
- Evaluator-Optimizer = writer + editor looping until good enough.
- Feedback must be specific and actionable, not just a score.
- Always cap the rounds and define clear evaluation criteria.
Key takeaways
- Evaluator-Optimizer pairs a generating LLM with an evaluating LLM in a loop.
- The evaluator judges the output against criteria and returns actionable feedback.
- The optimizer revises using that feedback until the output passes or a cap is hit.
- Always cap the rounds and give the evaluator clear, written criteria.