Learning & Adaptation
An agent that gets smarter every day
Imagine a new waiter on their first day. They mix up orders, forget the specials, and take too long. But after a week, they remember regulars, know the menu by heart, and work fast. They learned from experience. A Learning & Adaptation agent does the same: instead of making the same mistakes forever, it remembers what worked, notices what failed, and adjusts so tomorrow it does better than today.
Key points
- A static agent makes the same mistakes forever.
- An adaptive agent improves by learning from feedback and experience.
- The simplest form: remember past successes and reuse them.
What is Learning & Adaptation?
Learning & Adaptation is the pattern where an agent changes its future behaviour based on past results. After each task it asks: "Did that work? What should I do differently next time?" Then it stores that lesson and uses it later. The agent's strategy is not frozen — it evolves.
Note: Static agent = fixed recipe. Adaptive agent = recipe that updates itself.
Static agent vs Adaptive agent
STATIC AGENT (never learns)
Task ──► Act ──► Result ──► (forgotten)
Same task again? ──► Same mistake 🌀 (no memory of the last try), still wrong

ADAPTIVE AGENT (learns)
Task ──► Act ──► Result ──► FEEDBACK (good/bad?) ──► STORE lesson 📓
Same task again? ──► Use the lesson ──► better! ✅
The learning loop (the heartbeat of adaptation)
TRY an action ──► OBSERVE what happened ──► GET FEEDBACK (good/bad?) ──► UPDATE what I know ──► back to TRY
Next time, pick the action that worked.

Feedback can come from:
- the result itself (did the code run? ✅/❌)
- a human thumbs up / thumbs down 👍👎
- a reward score from the environment 🎯
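The loop above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `run_learning_loop`, `toy_environment`, and the action names are hypothetical, and the feedback here comes from "the result itself" (the environment returns "ok" or "error").

```python
import random

def run_learning_loop(actions, environment, rounds=20, seed=0):
    """Repeat TRY -> OBSERVE -> GET FEEDBACK -> UPDATE for a fixed number of rounds."""
    rng = random.Random(seed)
    knowledge = {action: 0 for action in actions}  # what the agent knows so far

    for _ in range(rounds):
        # TRY: pick the best-scoring action so far (ties broken at random)
        best = max(knowledge.values())
        action = rng.choice([a for a, s in knowledge.items() if s == best])
        # OBSERVE: see what happened
        outcome = environment(action)
        # GET FEEDBACK: turn the outcome into good/bad
        good = outcome == "ok"
        # UPDATE: move the score so the next TRY is better informed
        knowledge[action] += 1 if good else -1
    return knowledge

def toy_environment(action):
    # Hypothetical environment: only one action ever succeeds.
    return "ok" if action == "retry_with_backoff" else "error"

knowledge = run_learning_loop(["fail_fast", "retry_with_backoff"], toy_environment)
print(knowledge)
print("Agent now prefers:", max(knowledge, key=knowledge.get))
```

Each pass through the `for` body is one trip around the loop; the only state that survives between rounds is `knowledge`, which is what lets the next attempt differ from the last.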
What an adaptive agent needs
- Experience memory — A place to store what it tried and how it went (a list, a database, a vector store). Example: {'task': 'parse date', 'approach': 'regex', 'worked': True}
- A feedback signal — Some way to know if the last action was good or bad. Example: Tests passed, user clicked 'helpful', or a reward number.
- An update rule — Logic that turns feedback into a changed strategy next time. Example: If an approach worked, prefer it; if it failed, avoid or downrank it.
- Few-shot recall — Drop past successful examples into the prompt so the LLM copies what worked. Example: "Last time this kind of task succeeded with these steps: ..."
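As a rough sketch of how experience memory and few-shot recall might fit together, the snippet below stores attempts in a plain list and drops past successes into a prompt. The helper names (`record`, `recall_successes`, `build_prompt`) are made up for illustration, and matching tasks by exact string is a simplifying assumption; a real agent might use a database or similarity search instead.

```python
experiences = []  # experience memory: a plain list here

def record(task, approach, worked):
    """Store what was tried and how it went."""
    experiences.append({"task": task, "approach": approach, "worked": worked})

def recall_successes(task, limit=3):
    """Return up to `limit` of the most recent successful attempts at this task."""
    hits = [e for e in experiences if e["task"] == task and e["worked"]]
    return hits[-limit:]

def build_prompt(task, request):
    """Prepend past successes (if any) to the request as few-shot hints."""
    lines = []
    examples = recall_successes(task)
    if examples:
        lines.append("Last time this kind of task succeeded with these approaches:")
        lines.extend(f"- {e['approach']}" for e in examples)
    lines.append(f"Task: {request}")
    return "\n".join(lines)

record("parse date", "regex", True)
record("parse date", "split", False)

prompt = build_prompt("parse date", "Extract the date from '2024-05-01 report'")
print(prompt)
```

Only the successful `regex` attempt reaches the prompt; the failed `split` attempt stays in memory but is filtered out by the `worked` flag, which acts as the feedback signal here.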
A tiny adaptive agent (read it like English)
Here the agent keeps a score for each strategy. After every attempt it rewards the winner and penalises the loser, so over time it leans toward whatever works. This is the essence of learning: numbers that move with feedback.
scores = {"regex": 0, "split": 0}  # strategies start equal

def choose():
    # pick the strategy with the highest score so far
    return max(scores, key=scores.get)

def give_feedback(strategy, worked):
    scores[strategy] += 1 if worked else -1  # update from result

# pretend 'regex' keeps succeeding and 'split' keeps failing
give_feedback("regex", True)
give_feedback("split", False)
print("Agent will now prefer:", choose())  # -> regex
Try it: an agent that learns which strategy works
Flip a 'liked' value or add new rows to history, then re-run the script to watch the agent re-learn.
# Two ways to greet. The agent doesn't know which the 'user' likes.
# It LEARNS from thumbs up/down feedback and adapts its choice.
scores = {"formal": 0, "casual": 0}

def choose():
    return max(scores, key=scores.get)  # exploit the best so far

def feedback(strategy, liked):
    scores[strategy] += 1 if liked else -1

# Pretend the user secretly prefers 'casual'.
history = [("formal", False), ("casual", True),
           ("casual", True), ("formal", False)]

for strategy, liked in history:
    feedback(strategy, liked)
    print(f"tried {strategy:6} liked={liked} scores={scores} -> prefers {choose()}")

print("\nAfter learning, the agent settles on:", choose())
When should an agent learn & adapt?
| Scenario | Recommendation | Why |
|---|---|---|
| The agent repeats similar tasks many times | ✅ Add learning | It can get faster and more accurate with each repeat. |
| You have a clear feedback signal (tests, ratings, rewards) | ✅ Add learning | Feedback is the fuel; without it there's nothing to learn from. |
| A one-off task you'll never repeat | ❌ Skip it | There's no future run to benefit from the lesson. |
| Safety-critical behaviour that must never drift | ⚠️ Be careful | Letting it self-change can introduce unsafe, unpredictable behaviour. |
Learning mistakes beginners make
| Mistake | Consequence | Fix |
|---|---|---|
| Learning with no feedback signal. | The agent 'updates' on noise and gets worse, not better. | Define a clear good/bad signal (tests pass, user rating) before adding learning. |
| Trusting one lucky result. | One fluke makes the agent over-commit to a bad strategy. | Average over several attempts before changing strategy. |
| Never forgetting old lessons. | Stale advice from months ago drowns out what works now. | Decay or expire old experiences so recent feedback matters more. |
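The last two fixes in the table can be combined in one small update rule. The sketch below is one possible approach, not a standard recipe: the `StrategyStats` class and its `decay` and `min_trials` parameters are hypothetical, and the values 0.9 and 3 are arbitrary choices for illustration. It keeps a success rate per strategy in which older results count for less, and it refuses to pick a winner until a strategy has been tried several times.

```python
class StrategyStats:
    """Tracks a recency-weighted success rate per strategy."""

    def __init__(self, strategies, decay=0.9, min_trials=3):
        self.decay = decay            # < 1.0 means older results fade
        self.min_trials = min_trials  # attempts required before a strategy is trusted
        self.stats = {s: {"wins": 0.0, "weight": 0.0, "trials": 0} for s in strategies}

    def feedback(self, strategy, worked):
        entry = self.stats[strategy]
        # Fade this strategy's earlier evidence, then add the new result at full weight.
        entry["wins"] = entry["wins"] * self.decay + (1.0 if worked else 0.0)
        entry["weight"] = entry["weight"] * self.decay + 1.0
        entry["trials"] += 1

    def rate(self, strategy):
        entry = self.stats[strategy]
        return entry["wins"] / entry["weight"] if entry["weight"] else 0.0

    def choose(self, default=None):
        # Ignore strategies with too few attempts, so one fluke cannot decide.
        trusted = [s for s, e in self.stats.items() if e["trials"] >= self.min_trials]
        if not trusted:
            return default
        return max(trusted, key=self.rate)

stats = StrategyStats(["formal", "casual"])
stats.feedback("formal", True)   # one lucky result
first_choice = stats.choose()    # not enough evidence yet, so no winner

for liked in [True, True, False, True]:
    stats.feedback("casual", liked)
for liked in [False, False]:
    stats.feedback("formal", liked)
final_choice = stats.choose()

print("first choice:", first_choice)
print("final choice:", final_choice)
```

With `decay` below 1, a result from many attempts ago contributes far less than the latest one, so the agent can change its mind if a once-good strategy stops working.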
Remember these lines
- No feedback signal, no learning. Define good/bad first.
- Cheapest learning = store successful examples and reuse them as few-shot.
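- RLHF (reinforcement learning from human feedback) applies the same idea at much larger scale: humans rate or rank model outputs, and those ratings drive updates to the model itself.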
Key takeaways
- Learning & Adaptation means the agent changes future behaviour based on past results.
- It needs experience memory, a feedback signal, and an update rule.
- The simplest version: keep scores per strategy and prefer the winners.
- Use it for repeated tasks with clear feedback; be cautious in safety-critical settings.