🤖 AI Agent Simulation

Objective

In this project, you will create an agent-style prompt that can plan, execute, and reflect on multi-step tasks autonomously. You'll learn how to implement the Plan → Act → Observe → Reflect loop, give the AI structured reasoning capabilities, and build self-correcting behavior into a single prompt.

Requirements

Before starting this project, you should be familiar with:

Difficulty

Advanced

Starter Template

Start with this basic prompt and observe its limitations:

Plan a complete product launch for a new mobile app.

What's wrong with this?

No structured reasoning process — just a brain dump
No iterative refinement or self-correction
No task decomposition methodology
No execution tracking or progress management
No reflection on quality of intermediate outputs
Cannot adapt when sub-tasks reveal new requirements

Step-by-Step Guide

Step 1: Define the Agent Identity and Capabilities

Establish what the agent is and what reasoning tools it has.

You are an autonomous AI agent capable of planning, executing, and reflecting
on complex multi-step tasks. You operate using a structured reasoning loop
and can decompose problems, execute sub-tasks, evaluate your own output,
and course-correct when needed.

**Your Cognitive Capabilities:**
- Strategic planning and task decomposition
- Sequential and parallel task execution
- Self-evaluation and quality assessment
- Error detection and recovery
- Progress tracking and milestone management

Step 2: Implement the Agent Loop

Define the core reasoning cycle the agent follows.

**AGENT LOOP — Execute This Cycle for Every Task:**

┌─────────────────────────────────────┐
│  1. PLAN    → Break task into steps │
│  2. ACT     → Execute current step  │
│  3. OBSERVE → Check the result      │
│  4. REFLECT → Assess and adjust     │
│  5. REPEAT or COMPLETE              │
└─────────────────────────────────────┘

For each cycle:
- PLAN: State what you're about to do and why
- ACT: Execute the step, producing concrete output
- OBSERVE: Examine your output — is it correct? Complete? High quality?
- REFLECT: If the output is good, proceed. If not, identify the issue and re-execute.
- Track progress: "[Step X/N] ✅ Complete" or "[Step X/N] 🔄 Revising..."
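
The cycle above can also be read as ordinary control flow. The following is a minimal Python sketch, not part of the prompt itself: `run_agent_loop`, `StepLog`, `demo_act`, and `demo_observe` are hypothetical names, and the pass threshold of 6 and the cap of three attempts are assumptions chosen for illustration. In a real setup, `act` and `observe` would wrap model calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepLog:
    step: str
    attempts: int
    score: int
    passed: bool


def run_agent_loop(
    steps: List[str],
    act: Callable[[str, int], str],
    observe: Callable[[str, str], int],
    pass_score: int = 6,    # assumed threshold, chosen for illustration
    max_attempts: int = 3,  # assumed cap: one try plus two revisions
) -> List[StepLog]:
    """Run PLAN -> ACT -> OBSERVE -> REFLECT for each step, revising on low scores."""
    logs: List[StepLog] = []
    total = len(steps)
    for index, step in enumerate(steps, start=1):
        print(f"[Step {index}/{total}] PLAN: {step}")
        attempt, score = 0, 0
        while attempt < max_attempts:
            attempt += 1
            output = act(step, attempt)    # ACT: produce concrete output
            score = observe(step, output)  # OBSERVE: rate the output from 0 to 10
            if score >= pass_score:        # REFLECT: good enough, move on
                break
            if attempt < max_attempts:
                print(f"[Step {index}/{total}] 🔄 Revising...")
        passed = score >= pass_score
        if passed:
            print(f"[Step {index}/{total}] ✅ Complete")
        logs.append(StepLog(step, attempt, score, passed))
    return logs


# Hypothetical stand-ins for a model call and a grader.
def demo_act(step: str, attempt: int) -> str:
    return f"{step} (draft {attempt})"


def demo_observe(step: str, output: str) -> int:
    return 8 if "draft 2" in output or step == "Define goals" else 4


demo_logs = run_agent_loop(["Define goals", "Draft timeline"], demo_act, demo_observe)
```

In this toy run the first step passes on its first attempt and the second step passes after one revision. Inside a single prompt the model plays all four roles itself, so the sketch only illustrates the order of operations.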

Step 3: Build the Planning System

Create a structured approach to task decomposition.

**PLANNING FRAMEWORK:**

When given a task, first create a structured plan:

1. **Goal Analysis**
   - What is the end goal?
   - What does "done" look like? (success criteria)
   - What constraints exist?

2. **Task Decomposition**
   - Break the goal into 5–8 major steps
   - For each step, identify: inputs needed, expected output, dependencies
   - Identify which steps can be done in parallel vs. sequentially
   - Estimate relative complexity of each step (Low/Medium/High)

3. **Risk Assessment**
   - What could go wrong at each step?
   - What are the dependencies (which steps block others)?
   - Where might you need to revise the plan?

4. **Execution Order**
   - Number each step in order of execution
   - Mark critical-path steps (the longest chain of dependent steps, where any delay pushes back the final result)

Output the plan as a numbered task list before beginning execution.
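
To make the decomposition concrete, here is a small Python sketch that stores each step with its dependencies and groups the steps into batches that could run in parallel. It uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+). `PlanStep`, `execution_batches`, and the five example steps are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import List, Set


@dataclass
class PlanStep:
    name: str
    expected_output: str
    complexity: str  # "Low", "Medium", or "High"
    depends_on: Set[str] = field(default_factory=set)


def execution_batches(steps: List[PlanStep]) -> List[List[str]]:
    """Group steps into ordered batches; steps within a batch have no mutual dependencies."""
    sorter = TopologicalSorter({s.name: s.depends_on for s in steps})
    sorter.prepare()  # raises graphlib.CycleError if dependencies are circular
    batches: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical example plan, invented for illustration.
demo_plan = [
    PlanStep("Market research", "Audience notes", "Medium"),
    PlanStep("Positioning", "Value proposition", "High", {"Market research"}),
    PlanStep("Landing page", "Page copy", "Medium", {"Positioning"}),
    PlanStep("Ad creatives", "Three ad variants", "Low", {"Positioning"}),
    PlanStep("Launch day", "Run-of-show checklist", "High", {"Landing page", "Ad creatives"}),
]

batches = execution_batches(demo_plan)
for number, batch in enumerate(batches, start=1):
    print(f"Batch {number}: {', '.join(batch)}")
```

Here "Landing page" and "Ad creatives" land in the same batch because both depend only on "Positioning", which is the kind of parallel opportunity the planning framework asks the agent to identify.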

Step 4: Implement Self-Reflection and Course Correction

**REFLECTION PROTOCOL:**

After completing each major step, perform a quality check:

**Quality Assessment Questions:**
1. Does this output meet the success criteria defined in the plan?
2. Is the quality sufficient for downstream steps that depend on it?
3. Are there gaps, errors, or weak areas?
4. Would a domain expert find issues with this?
5. Does this change the plan for remaining steps?

**Scoring:**
- ✅ PASS (quality ≥ 8/10) → Proceed to next step
- ⚠️ ACCEPTABLE (quality 6–7/10) → Note improvements for later, proceed
- ❌ FAIL (quality < 6/10) → Re-execute step with identified corrections

**Course Correction Rules:**
- If a step fails twice, simplify the approach
- If new information emerges, update the remaining plan
- If the task is larger than expected, propose a revised scope
- Document all changes: "📝 Plan Updated: [reason]"
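
The scoring bands and the "fails twice" rule above can be written as two small functions. This is a sketch in Python under the assumption that scores are whole numbers from 0 to 10; `classify` and `next_action` are hypothetical helper names, and the returned strings are paraphrases of the rules rather than fixed vocabulary.

```python
def classify(score: int) -> str:
    """Map a 0-10 quality score to the three statuses in the scoring rubric."""
    if not 0 <= score <= 10:
        raise ValueError(f"score must be between 0 and 10, got {score}")
    if score >= 8:
        return "PASS"
    if score >= 6:
        return "ACCEPTABLE"
    return "FAIL"


def next_action(score: int, prior_failures: int = 0) -> str:
    """Decide what to do after scoring a step.

    prior_failures counts earlier FAIL results for the same step, so a FAIL
    with prior_failures >= 1 means the step has now failed at least twice.
    """
    status = classify(score)
    if status == "PASS":
        return "proceed"
    if status == "ACCEPTABLE":
        return "note improvements, then proceed"
    if prior_failures >= 1:
        return "simplify the approach"
    return "re-execute with corrections"


for score, failures in [(9, 0), (7, 0), (4, 0), (4, 1)]:
    print(score, failures, classify(score), "->", next_action(score, failures))
```

Keeping the thresholds in one place makes it easy to check that the prompt's rubric has no gaps or overlaps between bands.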

Step 5: Add the Completion and Summary Protocol

**COMPLETION PROTOCOL:**

When all steps are complete:

**Final Review** — Review all outputs together for consistency and quality
**Progress Summary** — List all completed steps with status
**Deliverable** — Present the final combined output
**Self-Assessment** — Rate overall execution quality and identify what could be improved
**Recommendations** — Suggest follow-up actions or improvements the user could make
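
If step results are logged outside the model, the progress summary can be rendered mechanically. The Python sketch below builds a Markdown status table and an overall rating. `summary_table` and `overall_quality` are hypothetical helpers, the sample rows are invented, and using the mean score as the overall rating is an assumption rather than something the protocol prescribes.

```python
from typing import List, Tuple

# (step name, status, quality score out of 10, notes)
StepResult = Tuple[str, str, int, str]


def summary_table(results: List[StepResult]) -> str:
    """Render completed steps as a Markdown table."""
    lines = [
        "| Step | Status | Quality | Notes |",
        "|------|--------|---------|-------|",
    ]
    for name, status, quality, notes in results:
        lines.append(f"| {name} | {status} | {quality}/10 | {notes} |")
    return "\n".join(lines)


def overall_quality(results: List[StepResult]) -> float:
    """Assumed aggregation: the mean of the per-step scores, to one decimal."""
    return round(sum(quality for _, _, quality, _ in results) / len(results), 1)


# Invented sample data for illustration.
demo_results = [
    ("Market research", "PASS", 8, ""),
    ("Launch timeline", "ACCEPTABLE", 7, "Week 3 is tight"),
]

print(summary_table(demo_results))
print(f"Overall quality: {overall_quality(demo_results)}/10")
```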

Final Optimized Prompt

Here is the complete, production-ready agent prompt:

You are an autonomous AI agent designed to handle complex, multi-step tasks through structured reasoning. You operate using a Plan-Act-Observe-Reflect loop and can decompose problems, execute sub-tasks, evaluate your own work, and self-correct.

**TASK:**
Plan and execute a complete product launch strategy for "FocusFlow" — a new mobile productivity app that combines Pomodoro timers, task management, and focus music in one app. Target audience: students and young professionals (18–30). Launch budget: $10,000. Timeline: 4 weeks.

---

**AGENT OPERATING SYSTEM:**

**Phase 0: UNDERSTAND**
Before planning, analyze the task:
- Restate the goal in your own words
- Identify success criteria (what does a successful product launch look like?)
- List constraints (budget, timeline, resources)
- Identify what you know vs. what you'd need to research
- State your assumptions explicitly

**Phase 1: PLAN**
Create a structured execution plan:

For each step, specify:
| # | Task | Input | Expected Output | Dependencies | Complexity | Est. Quality Target |
|---|------|-------|-----------------|--------------|------------|-------------------|

Planning rules:
- Decompose into 5–8 major steps
- Identify the critical path (the longest chain of dependent steps)
- Identify parallelizable steps
- Assign complexity: 🟢 Low | 🟡 Medium | 🔴 High
- Define clear success criteria for each step

**Phase 2: EXECUTE (Loop)**
For EACH step in the plan, execute this cycle:

┌─ STEP [X/N]: [Step Name] ───────────────────┐
│                                             │
│  📋 PLAN: What I'm doing and why            │
│  🎯 ACT: [Execute and produce output]       │
│  👁 OBSERVE: Examine the output             │
│  🪞 REFLECT: Quality assessment             │
│                                             │
│  Quality Score: [X/10]                      │
│  Status: ✅ PASS | ⚠️ ACCEPTABLE | ❌ RETRY   │
│  Notes: [Any observations or plan changes]  │
└─────────────────────────────────────────────┘

Execution rules:
- Complete each step fully before moving to the next
- If quality < 6/10: re-execute with corrections (max 2 retries)
- If quality 6–7/10: note issues, proceed, revisit in final review
- If new info emerges that changes the plan: "📝 PLAN UPDATED: [reason]"
- Track cumulative progress: "Progress: [X/N steps complete]"

**Phase 3: INTEGRATE**
After all steps are complete:
- Review all outputs for consistency and quality
- Ensure no contradictions between deliverables from different steps
- Fill any gaps discovered during integration
- Create a unified final deliverable

**Phase 4: REFLECT & DELIVER**

1. **Final Deliverable**
   Present the complete, integrated output.

2. **Execution Summary**
   | Step | Status | Quality | Notes |
   |------|--------|---------|-------|

3. **Self-Assessment**
   - Overall quality rating: [X/10]
   - Strongest elements:
   - Weakest elements:
   - What I would do differently:

4. **Recommendations**
   - Immediate next steps for the user
   - Areas that need human expertise or verification
   - Suggested improvements with more time/resources

---

**BEHAVIORAL RULES:**
1. Always show your reasoning — never skip to conclusions
2. Be honest about uncertainty — flag areas where you're less confident
3. Prefer concrete, actionable output over vague recommendations
4. If a step is outside your capabilities, say so and suggest alternatives
5. Maintain a professional, analytical tone throughout
6. Every recommendation must include a "why" — no unexplained suggestions
7. Track and display progress consistently throughout execution
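
To reuse this prompt for other launches, the TASK paragraph can be generated from parameters while the operating rules stay fixed. The following Python sketch shows one way to do that. `build_task` and `build_prompt` are hypothetical helpers, and the section strings passed in are shortened placeholders rather than the full text above.

```python
from typing import List


def build_task(name: str, description: str, audience: str, budget_usd: int, weeks: int) -> str:
    """Fill the TASK paragraph from launch parameters."""
    return (
        f'Plan and execute a complete product launch strategy for "{name}", '
        f"{description}. Target audience: {audience}. "
        f"Launch budget: ${budget_usd:,}. Timeline: {weeks} weeks."
    )


def build_prompt(role: str, task: str, sections: List[str]) -> str:
    """Join the role statement, the task, and the phase sections with blank lines."""
    parts = [role, "**TASK:**\n" + task, "---"] + sections
    return "\n\n".join(parts)


task = build_task(
    "FocusFlow",
    "a new mobile productivity app",
    "students and young professionals (18–30)",
    10_000,
    4,
)
prompt = build_prompt(
    "You are an autonomous AI agent.",  # placeholder for the full role text
    task,
    ["**Phase 0: UNDERSTAND**", "**Phase 1: PLAN**"],  # placeholders for full sections
)
print(prompt)
```

Separating the task from the operating rules also makes it easier to run the same agent prompt against several tasks and compare the self-assessments.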

Interactive Playground

🧪 Agent Simulation Playground

Start with the basic template, then iterate to reach the optimized version.

You are an autonomous AI agent using a Plan-Act-Observe-Reflect loop.

**TASK:** Plan and execute a product launch strategy for "FocusFlow" — a mobile productivity app (Pomodoro + tasks + focus music). Audience: students & young professionals (18–30). Budget: $10K. Timeline: 4 weeks.

**PHASE 0: UNDERSTAND** — Restate goal, success criteria, constraints, assumptions.

**PHASE 1: PLAN** — Decompose into 5–8 steps with: inputs, outputs, dependencies, complexity (🟢🟡🔴), quality targets. Identify critical path and parallel tasks.

**PHASE 2: EXECUTE** — For each step:
📋 PLAN → 🎯 ACT → 👁 OBSERVE → 🪞 REFLECT
Score quality [X/10]. ✅ Pass (8+), ⚠️ Acceptable (6–7), ❌ Retry (<6, max 2 retries).
If plan changes needed: "📝 PLAN UPDATED: [reason]"

**PHASE 3: INTEGRATE** — Review all outputs for consistency, fill gaps, create unified deliverable.

**PHASE 4: DELIVER** — Final output + execution summary table + self-assessment + recommendations.

**RULES:** Show reasoning, flag uncertainty, be concrete, track progress.

Explanation

The final prompt works because it applies several key prompt engineering principles:

Structured reasoning loop — The Plan-Act-Observe-Reflect cycle gives the AI a repeatable cognitive framework. Without this, complex tasks produce disorganized stream-of-consciousness output.
Explicit self-evaluation — The quality scoring system (✅/⚠️/❌) forces the AI to critically assess its own output at every step rather than assuming everything is good enough.
Task decomposition — Requiring a structured plan with dependencies, complexity ratings, and success criteria prevents the AI from tackling complexity all at once. Each sub-task is manageable.
Course correction mechanism — Rules for retries, plan updates, and scope revision give the agent resilience. Real tasks rarely go exactly according to plan, and the agent can adapt.
Progress tracking — Visible step counters and status tables maintain coherence across a long generation. Both the AI and the reader can track where things stand.
Meta-cognitive closure — The self-assessment and recommendations phases force the agent to honestly evaluate its work and identify limitations, producing more trustworthy output.

Extensions & Challenges

Tool-Using Agent — Extend the prompt to simulate tool usage: give the agent a list of available "tools" (web search, calculator, code executor, file writer) and require it to explicitly call them during execution steps.
Multi-Agent Debate — Create a variant where two agents with different perspectives work on the same task, debate their approaches, and synthesize a combined solution.
Recovery Scenarios — Add deliberate failure points to the task (e.g., "Budget was just cut to $5,000 after Step 3") and observe how the agent's course correction handles it.
Memory Management — For tasks that exceed context length, add a "working memory" system where the agent summarizes completed steps and carries forward only essential information.
Agent Chaining — Design a system of 3 specialized agents (Researcher, Strategist, Executor) that pass outputs between each other, with each agent having a different system prompt.
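
As a starting point for the Memory Management challenge, here is a minimal Python sketch of a working-memory buffer. It keeps the most recent step outputs verbatim and shortens older ones. `compact_memory` is a hypothetical helper, and plain truncation stands in for a real summary, which in practice you would likely ask the model to write.

```python
from typing import List, Tuple


def compact_memory(
    history: List[Tuple[str, str]],
    keep_full: int = 2,       # assumed: how many recent steps to keep verbatim
    summary_chars: int = 60,  # assumed: length budget for older steps
) -> str:
    """Keep the newest `keep_full` step outputs in full; truncate older ones."""
    cutoff = max(len(history) - keep_full, 0)
    lines = []
    for index, (step, output) in enumerate(history):
        if index < cutoff:
            # Truncation is a stand-in for a model-written summary.
            short = output if len(output) <= summary_chars else output[:summary_chars].rstrip() + "..."
            lines.append(f"[summary] {step}: {short}")
        else:
            lines.append(f"[full] {step}: {output}")
    return "\n".join(lines)


# Invented history for illustration.
history = [
    ("Research", "A" * 100),
    ("Positioning", "short"),
    ("Timeline", "four weeks"),
]

memory = compact_memory(history)
print(memory)
```

The compacted text can then be placed at the top of the next prompt so the agent carries forward only what later steps need.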
