
Evaluator-Optimizer Pattern

A writer and an editor working in a loop

When you write something important, you write a draft, then a friend reviews it and says "this part is unclear, fix it." You revise, they check again, and you repeat until it's good. Evaluator-Optimizer is the same teamwork between two LLM roles (often the same model with different prompts): the writer (optimizer) produces the work, and the editor (evaluator) judges it and gives feedback, looping until the result passes.

Key points

One LLM generates; another LLM evaluates against clear criteria.
The evaluator gives specific feedback, not just pass/fail.
The loop repeats: generate → evaluate → improve, until good enough.

The one-line definition

Evaluator-Optimizer is a workflow where one LLM (the optimizer/generator) produces an output, a second LLM (the evaluator) judges it against criteria and returns feedback, and the two loop — improving the output each round — until it meets the bar or a limit is hit.

Note: Writer + editor in a loop. Generate, critique, improve, repeat.

The generate-evaluate-improve loop

     TASK (e.g. 'translate this poem')
              │
              ▼
    ┌──────────────────┐
    │  OPTIMIZER ✍️    │ ◄────────────┐
    │  generate /      │              │
    │  improve draft   │              │
    └────────┬─────────┘              │
             │ draft                  │
             ▼                        │ feedback
    ┌──────────────────┐              │ ('fix X, Y')
    │  EVALUATOR 🧐    │              │
    │  good enough?    │──────────────┘
    │  check criteria  │   NOT yet
    └────────┬─────────┘
             │ PASS ✅
             ▼
      ✅ FINAL OUTPUT

Why feedback (not just a score) matters

PLAIN SCORE ONLY              SPECIFIC FEEDBACK
────────────────              ─────────────────
Evaluator: "6/10"             Evaluator: "Tone is
        │                     too formal; line 3
        ▼                     lost the rhyme."
Optimizer guesses                     │
blindly what to fix 🤷                ▼
        │                     Optimizer fixes the
        ▼                     EXACT problems 🎯
slow, random progress         fast, targeted fixes
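
One way to make the evaluator's feedback easy for the loop to use is to ask for a small structured reply and parse it. The sketch below assumes a hypothetical reply format: a first line of PASS or FAIL, followed by one issue per line prefixed with "- ". The names `Review` and `parse_review` are illustrative and not part of any library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Review:
    passed: bool
    issues: List[str] = field(default_factory=list)


def parse_review(text: str) -> Review:
    """Parse an evaluator reply in the assumed format:
    first non-empty line is PASS or FAIL, later lines starting with '- ' are issues.
    """
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines:
        # An empty reply is treated as a failure with no usable feedback.
        return Review(passed=False)
    passed = lines[0].upper().startswith("PASS")
    issues = [ln[2:].strip() for ln in lines[1:] if ln.startswith("- ")]
    return Review(passed=passed, issues=issues)


# Example reply, reusing the feedback from the comparison above.
reply = "FAIL\n- Tone is too formal\n- Line 3 lost the rhyme"
review = parse_review(reply)
print(review.passed)
print(review.issues)
```

With the issues held as a list, the optimizer prompt can name each problem explicitly instead of passing along a bare score.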

A tiny code example (read it like English)

Notice the loop has a max-rounds cap so it can never spin forever. Each round the evaluator returns whether it passed plus written feedback, and that feedback is fed back into the optimizer.

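def llm(prompt):
    """Placeholder: send `prompt` to your model and return its reply as a string."""
    raise NotImplementedError("connect this to your LLM client")


def refine(task, max_rounds=3):
    # OPTIMIZER writes the first draft
    draft = llm(f"Do this task: {task}")

    for _ in range(max_rounds):
        # EVALUATOR judges the draft against the task + gives feedback
        review = llm(
            "Start your reply with PASS or FAIL, then explain why.\n"
            f"TASK:\n{task}\nDRAFT:\n{draft}"
        )
        # Tolerate leading whitespace or lowercase in the verdict
        if review.strip().upper().startswith("PASS"):
            return draft
        feedback = review

        # OPTIMIZER improves using the feedback
        draft = llm(
            "Improve this draft using the feedback.\n"
            f"TASK:\n{task}\nDRAFT:\n{draft}\nFEEDBACK:\n{feedback}"
        )
    return draft   # best effort after the cap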
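
To watch the loop run without calling a real model, the two roles can be passed in as plain functions. The sketch below is a variant of the example above under that assumption: `refine_with`, `fake_generate`, and `fake_evaluate` are hypothetical names, and the two stubs are scripted stand-ins for LLM calls, not real model behaviour.

```python
from typing import Callable, Tuple


def refine_with(
    task: str,
    generate: Callable[[str, str, str], str],
    evaluate: Callable[[str, str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Run generate -> evaluate -> improve; return (draft, evaluations used)."""
    draft = generate(task, "", "")  # first draft: no previous draft, no feedback
    for round_no in range(1, max_rounds + 1):
        passed, feedback = evaluate(task, draft)
        if passed:
            return draft, round_no
        draft = generate(task, draft, feedback)  # revise using the critique
    return draft, max_rounds  # best effort after the cap


# Scripted stand-ins for the two LLM roles (assumptions for this demo only).
def fake_generate(task: str, previous: str, feedback: str) -> str:
    if not previous:
        return "hello world"
    if "capital" in feedback:
        return previous.capitalize() + "."
    return previous


def fake_evaluate(task: str, draft: str) -> Tuple[bool, str]:
    if draft[0].isupper() and draft.endswith("."):
        return True, ""
    return False, "Start with a capital letter and end with a period."


final, rounds = refine_with("Write a greeting", fake_generate, fake_evaluate)
print(final, rounds)  # prints: Hello world. 2
```

Injecting the roles also makes it easy to use a different model or prompt for the evaluator than for the optimizer, and to unit-test the loop logic separately from any model.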

When should you use evaluator-optimizer?

Scenario	Recommendation	Why
You have clear criteria for what 'good' looks like	✅ Use it	The evaluator can judge against those criteria and guide fixes.
Quality clearly improves when feedback is applied (e.g., writing, translation, code)	✅ Use it	Iterating with critique noticeably raises quality.
There's no objective way to tell if the output is 'good'	❌ Skip it	Without criteria, the evaluator can't give useful feedback.
A single call already meets the bar	❌ Single call	Extra evaluation rounds add cost for no gain.

Evaluator-Optimizer mistakes beginners make

Mistake	Consequence	Fix
No maximum number of rounds.	If it never passes, the loop runs forever and burns money.	Always cap rounds (e.g., 3) and return the best effort when the cap is hit.
The evaluator gives only a score, not actionable feedback.	The optimizer guesses blindly and barely improves.	Make the evaluator explain exactly what's wrong and how to fix it.
Vague or missing evaluation criteria.	The evaluator is inconsistent and the loop drifts aimlessly.	Give the evaluator a clear, written checklist of what 'good' means.
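
The "clear, written checklist" fix can be as simple as building the evaluator prompt from an explicit list of criteria. This is a minimal sketch: `build_evaluator_prompt` and the example criteria are hypothetical, and the prompt wording is an assumption to adapt to your own model.

```python
from typing import Sequence


def build_evaluator_prompt(task: str, draft: str, criteria: Sequence[str]) -> str:
    """Assemble an evaluator prompt that spells out what 'good' means."""
    if not criteria:
        raise ValueError("the evaluator needs at least one criterion")
    # Number the criteria so the evaluator can refer to them in its feedback.
    checklist = "\n".join(f"{i}. {c}" for i, c in enumerate(criteria, start=1))
    return (
        f"TASK:\n{task}\n\n"
        f"DRAFT:\n{draft}\n\n"
        f"CRITERIA (check every item):\n{checklist}\n\n"
        "Reply with PASS or FAIL on the first line. "
        "If FAIL, list each unmet criterion and how to fix it."
    )


prompt = build_evaluator_prompt(
    task="Translate this poem into English",
    draft="(draft text here)",
    criteria=["Keeps the original rhyme scheme", "Tone is informal", "No lines omitted"],
)
print(prompt)
```

Keeping the criteria in a list rather than buried in prose means the same checklist can be reused every round, which helps the evaluator stay consistent.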

Remember these lines

Evaluator-Optimizer = writer + editor looping until good enough.
Feedback must be specific and actionable, not just a score.
Always cap the rounds and define clear evaluation criteria.

Key takeaways

Evaluator-Optimizer pairs a generating LLM with an evaluating LLM in a loop.
The evaluator judges the output against criteria and returns actionable feedback.
The optimizer revises using that feedback until the output passes or a cap is hit.
Always cap the rounds and give the evaluator clear, written criteria.

