⚡ AI Prompt Optimization System
Objective
In this project, you will build a meta-prompt – a prompt that evaluates, scores, and improves other prompts automatically. This is prompt engineering applied to itself: you'll create a system that can analyze a prompt's strengths, identify weaknesses, and produce an optimized version with explanations for every change.
Requirements
Before starting this project, you should be familiar with:
- Prompt Optimization
- Reflection Prompting
- Chain of Thought
- Output Validation
- Iterative Refinement
- Why Prompts Fail
Difficulty
Advanced
Starter Template
Start with this basic prompt and observe its limitations:
Make this prompt better: "Write a blog post about AI."
What's wrong with this?
- No evaluation criteria – what does "better" mean?
- No systematic analysis of the original prompt's weaknesses
- No optimization framework or methodology
- No scoring or comparison between before/after
- No explanation of why changes were made
- The "improved" prompt is based on the optimizer's assumptions, not the user's goals
Step-by-Step Guide
Step 1: Define the Optimizer Role
Establish the AI as a prompt engineering expert with a systematic methodology.
You are an expert prompt engineer and optimization specialist. You analyze prompts
using a rigorous evaluation framework, identify specific weaknesses, and produce
measurably improved versions. Your approach is systematic, evidence-based, and
always explains the reasoning behind every change.
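If you drive the optimizer from code rather than a chat UI, this role text typically becomes the system message. Here is a minimal sketch using the OpenAI Python SDK; the model name is a placeholder assumption, and any chat-completion client would work the same way.

```python
# Minimal sketch: the Step 1 role definition wired in as a system message.
# Assumes the OpenAI Python SDK; the model name is a placeholder.
from openai import OpenAI

OPTIMIZER_ROLE = (
    "You are an expert prompt engineer and optimization specialist. You analyze prompts "
    "using a rigorous evaluation framework, identify specific weaknesses, and produce "
    "measurably improved versions. Your approach is systematic, evidence-based, and "
    "always explains the reasoning behind every change."
)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_optimizer(user_message: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # assumption: substitute the model you actually use
        messages=[
            {"role": "system", "content": OPTIMIZER_ROLE},
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content
```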
Step 2: Build the Evaluation Framework
Create specific criteria for scoring prompts.
**PROMPT EVALUATION RUBRIC – Score Each Dimension (1–10):**
1. **Clarity** – Is the instruction unambiguous? Could it be misinterpreted?
2. **Specificity** – Are desired outputs concretely defined? Are constraints explicit?
3. **Context** – Does the prompt provide enough background for accurate responses?
4. **Structure** – Is the prompt logically organized? Are sections clear?
5. **Role Definition** – Is the AI's persona/expertise properly established?
6. **Output Format** – Is the expected output format explicitly specified?
7. **Constraint Coverage** – Are edge cases, limitations, and guardrails addressed?
8. **Examples** – Does the prompt include examples when they would help?
9. **Completeness** – Does the prompt cover everything needed for a quality response?
10. **Efficiency** – Is the prompt concise without sacrificing effectiveness?
**Overall Score:** Average of all dimensions, rounded to 1 decimal.
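To make the aggregation rule concrete, here is a minimal sketch of the overall-score computation exactly as specified (mean of all ten dimensions, rounded to one decimal). The dimension keys are illustrative shorthand, not part of the rubric text itself.

```python
# Minimal sketch of the rubric's aggregation rule. Dimension keys are
# illustrative shorthand for the ten dimensions above.
RUBRIC_DIMENSIONS = (
    "clarity", "specificity", "context", "structure", "role_definition",
    "output_format", "constraint_coverage", "examples", "completeness",
    "efficiency",
)

def overall_score(scores: dict[str, int]) -> float:
    """Average of all dimensions, rounded to 1 decimal, per the rubric."""
    missing = [d for d in RUBRIC_DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"unscored dimensions: {missing}")
    return round(sum(scores[d] for d in RUBRIC_DIMENSIONS) / len(RUBRIC_DIMENSIONS), 1)

# Example: overall_score({d: 4 for d in RUBRIC_DIMENSIONS}) -> 4.0
```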
Step 3: Create the Analysis Process
Define how the optimizer breaks down prompt weaknesses.
**ANALYSIS PROCESS:**
Step 1: Read the prompt and identify the user's apparent intent
Step 2: Score each rubric dimension with a 1-sentence justification
Step 3: Identify the top 3 weaknesses (lowest-scoring dimensions)
Step 4: For each weakness, explain:
- What the problem is
- Why it matters (how it affects output quality)
- Specific fix to apply
Step 5: Identify any missing elements that would significantly improve results
Step 6: Check for common anti-patterns:
- Vague instructions ("make it good")
- Missing constraints (no length, format, or tone guidance)
- Assumed context (depending on info not in the prompt)
- Conflicting instructions
- Over-engineering (unnecessary complexity)
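Anti-pattern detection ultimately relies on the LLM's judgment, but the most literal cases can be caught mechanically before you even call the model. Below is a toy heuristic sketch; the phrase lists are assumptions for illustration, not an exhaustive taxonomy.

```python
# Toy heuristic scan for the most literal anti-patterns in Step 6.
# The phrase lists are illustrative guesses; the real detection
# happens inside the meta-prompt.
import re

def quick_antipattern_scan(prompt: str) -> list[str]:
    findings = []
    if re.search(r"\bmake it (good|better|nice)\b|\bbe creative\b", prompt, re.I):
        findings.append("vague instruction (e.g., 'make it good')")
    if not re.search(r"\b(words?|characters?|length|paragraphs?)\b", prompt, re.I):
        findings.append("missing constraint: no length guidance")
    if not re.search(r"\b(tone|audience|format|style)\b", prompt, re.I):
        findings.append("missing constraint: no tone/format/audience guidance")
    return findings

print(quick_antipattern_scan('Make this prompt better: "Write a blog post about AI."'))
# -> ['missing constraint: no length guidance',
#     'missing constraint: no tone/format/audience guidance']
```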
Step 4: Build the Optimization Engine
Define how improvements are generated.
**OPTIMIZATION RULES:**
1. Preserve Intent: The optimized prompt must serve the same goal as the original
2. Incremental Improvement: Fix weaknesses without over-engineering
3. Explain Every Change: Every modification includes a [WHY] tag
4. Maintain Voice: If the original has a specific style, preserve it
5. Add, Don't Replace: When the original has good elements, keep them
6. Prioritize Impact: Fix the highest-impact issues first
7. Test Mentally: Before finalizing, mentally simulate how an LLM would respond to both the original and improved version – the difference should be clear
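Rule 3 can also be enforced structurally if you log changes programmatically: make the justification a required field, as in this sketch. The field names are illustrative and mirror the change-log tags used later in the final meta-prompt.

```python
# Sketch: enforcing "Explain Every Change" by construction. A Change
# record cannot exist without a non-empty justification. Field names
# are illustrative and mirror the final meta-prompt's change-log tags.
from dataclasses import dataclass
from typing import Literal

@dataclass(frozen=True)
class Change:
    kind: Literal["ADDED", "MODIFIED", "REMOVED", "RESTRUCTURED"]
    description: str
    why: str  # the [WHY] tag, mandatory

    def __post_init__(self) -> None:
        if not self.why.strip():
            raise ValueError("every change needs a [WHY] justification")

c = Change("ADDED", "explicit 800-word limit", "prevents unbounded rambling output")
print(f"[{c.kind}] {c.description} [WHY: {c.why}]")
```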
Step 5: Define the Output Report Format
**OUTPUT FORMAT:**
## 📊 Prompt Analysis Report
### Original Prompt
[Display the original prompt]
### Rubric Scores
| Dimension | Score | Assessment |
|-----------|-------|------------|
| ... | X/10 | One-line justification |
### Overall Score: X.X/10
### Top 3 Weaknesses
1. **[Weakness]** – Impact: [How it hurts output] – Fix: [What to do]
2. ...
3. ...
### 🔧 Optimized Prompt
[The improved prompt]
### Changes Made
1. [Change] – [WHY: reason]
2. ...
### Predicted Improvement
- Original prompt would produce: [describe likely output]
- Optimized prompt would produce: [describe likely output]
- Key difference: [the main improvement]
### New Score: X.X/10
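If you want to track scores across runs, the report's rubric table is machine-readable. A best-effort parsing sketch follows; it assumes the model kept the `| Dimension | Score | Assessment |` layout defined above, which real output may drift from.

```python
# Best-effort sketch: pull "X/10" scores back out of the report's
# markdown rubric table. Assumes the model followed the format above.
import re

ROW = re.compile(r"^\|\s*(?P<dim>[^|]+?)\s*\|\s*(?P<score>\d+)\s*/\s*10\s*\|")

def extract_scores(report: str) -> dict[str, int]:
    scores = {}
    for line in report.splitlines():
        m = ROW.match(line)
        if m:
            scores[m.group("dim").strip("* ")] = int(m.group("score"))
    return scores

# Matches rows like: | **Clarity** | 3/10 | Topic and audience undefined |
```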
Final Optimized Prompt
Here is the complete, production-ready meta-prompt:
You are an expert prompt engineer and optimization specialist with deep knowledge of LLM behavior, prompt patterns, and output quality factors. You analyze prompts using a rigorous scoring framework, identify precise weaknesses, and produce measurably superior versions.
**YOUR TASK:**
Analyze and optimize the following prompt. Produce a detailed evaluation report with a scored assessment, specific weaknesses identified, and an improved version with every change explained.
**PROMPT TO OPTIMIZE:**
[PASTE THE PROMPT TO ANALYZE HERE]
---
**STEP 1: INTENT DETECTION**
Before evaluating, determine:
- What is the user trying to accomplish with this prompt?
- What kind of output do they expect? (text, code, analysis, creative, etc.)
- Who is the likely audience for the output?
- What context might the user have that isn't in the prompt?
State your assessment clearly before proceeding.
**STEP 2: RUBRIC EVALUATION**
Score each dimension 1–10 with a one-sentence justification:
| # | Dimension | What It Measures | Score | Justification |
|---|-----------|------------------|-------|---------------|
| 1 | **Clarity** | Is the instruction unambiguous? Zero room for misinterpretation? | /10 | |
| 2 | **Specificity** | Are desired outputs, constraints, and parameters concrete? | /10 | |
| 3 | **Context** | Is sufficient background provided for accurate responses? | /10 | |
| 4 | **Structure** | Is the prompt logically organized with clear sections? | /10 | |
| 5 | **Role Definition** | Is the AI's expertise/persona properly established? | /10 | |
| 6 | **Output Format** | Is the expected format explicitly specified? | /10 | |
| 7 | **Constraints** | Are boundaries, edge cases, and guardrails addressed? | /10 | |
| 8 | **Examples** | Are examples included where they would improve output? | /10 | |
| 9 | **Completeness** | Does the prompt cover everything needed? | /10 | |
| 10 | **Efficiency** | Is it concise without sacrificing quality? | /10 | |
**Overall Score: [Average]/10**
**STEP 3: WEAKNESS ANALYSIS**
Identify the **top 3 weaknesses** (lowest-scoring dimensions):
For each weakness:
- 🔍 **Problem:** What's wrong
- 💥 **Impact:** How it degrades output quality (with an example)
- 🔧 **Fix:** Specific change to make
Also check for these **anti-patterns:**
- ⚠️ Vague instructions ("make it good," "be creative")
- ⚠️ Missing constraints (no length, format, tone, or audience)
- ⚠️ Assumed context (relies on info not in the prompt)
- ⚠️ Conflicting instructions (contradictory requirements)
- ⚠️ Over-engineering (unnecessarily complex for the task)
- ⚠️ Under-engineering (too simple for a complex task)
**STEP 4: GENERATE OPTIMIZED PROMPT**
Create the improved version following these rules:
1. **Preserve intent** – Same goal, better execution
2. **Explain every change** – Tag each modification with [WHY: reason]
3. **Preserve good elements** – Keep what works, improve what doesn't
4. **Prioritize impact** – Fix highest-impact issues first
5. **Right-size complexity** – Match prompt sophistication to task complexity
6. **Mental simulation** – Verify the optimized prompt would produce clearly better output
**STEP 5: CHANGE LOG**
List every change made in a numbered list:
- [ADDED] / [MODIFIED] / [REMOVED] / [RESTRUCTURED] – Description – [WHY: reason]
**STEP 6: BEFORE/AFTER PREDICTION**
- **Original prompt likely produces:** [Describe expected output quality and characteristics]
- **Optimized prompt likely produces:** [Describe expected output quality and characteristics]
- **Key improvement:** [The single biggest difference]
**STEP 7: FINAL SCORING**
Re-score the optimized prompt using the same rubric. Show the score improvement.
| Dimension | Before | After | Change |
|-----------|--------|-------|--------|
| ... | X/10 | X/10 | +X |
**Overall: [Before] → [After] (+[Improvement])**
---
**OUTPUT QUALITY STANDARDS:**
- Be specific and actionable – "add a role definition" not "make it clearer"
- Every criticism must come with a concrete fix
- The optimized prompt should be ready to use – not a suggestion, a complete rewrite
- If the original prompt is already strong (8+/10), focus on fine-tuning and edge cases
- If the original is weak (<5/10), the optimization may be a substantial rewrite – that's okay
- Never be condescending about the original – analyze professionally
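To run the meta-prompt end to end, substitute the target prompt for the placeholder and send the whole thing as a single message. A minimal sketch with the OpenAI Python SDK follows; `meta_prompt.txt` is assumed to hold the full text above (including the placeholder line), and the model name is a placeholder.

```python
# Minimal sketch: filling the placeholder and running the meta-prompt.
# Assumes meta_prompt.txt contains the full prompt above, including the
# [PASTE THE PROMPT TO ANALYZE HERE] line; model name is a placeholder.
from pathlib import Path
from openai import OpenAI

META_PROMPT = Path("meta_prompt.txt").read_text()
client = OpenAI()  # reads OPENAI_API_KEY from the environment

def optimize(target_prompt: str) -> str:
    """Return the full analysis report for target_prompt."""
    filled = META_PROMPT.replace("[PASTE THE PROMPT TO ANALYZE HERE]", target_prompt)
    response = client.chat.completions.create(
        model="gpt-4o",  # assumption: use whichever capable model you prefer
        messages=[{"role": "user", "content": filled}],
    )
    return response.choices[0].message.content

print(optimize('Make this prompt better: "Write a blog post about AI."'))
```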
Interactive Playground
🧪 Prompt Optimizer Playground
Start with the basic template, then iterate to reach the optimized version.
Explanation
The final prompt works because it applies several key prompt engineering principles:
- **Meta-level reasoning** – This prompt teaches the AI to think about prompts rather than simply executing them. This requires a different cognitive mode (evaluation rather than generation), and the structured framework guides that shift.
- **Quantified evaluation** – The 10-dimension rubric with numerical scores forces systematic analysis rather than vague impressions. Numbers create accountability ("Specificity: 3/10" is much more actionable than "could be more specific").
- **Mandatory explanations** – Requiring [WHY] tags for every change prevents arbitrary modifications. If the optimizer can't explain why a change improves things, it shouldn't make it.
- **Before/after prediction** – Mental simulation of both prompts' outputs forces the optimizer to verify that changes actually improve results, not just look more professional.
- **Anti-pattern detection** – Explicitly listing common prompt failures (vague instructions, assumed context, conflicting rules) gives the optimizer a checklist to catch issues a general analysis might miss.
- **Professional framing** – The instruction "never be condescending about the original" ensures the output is useful feedback, not criticism – important when this tool is used by prompt learners.
Extensions & Challenges
- **Batch Optimizer** – Modify the prompt to accept 5 prompts at once and produce a comparative analysis, ranking them from strongest to weakest with a unified improvement plan.
- **Domain-Specific Calibration** – Create variants calibrated for specific prompt types (coding prompts, creative writing prompts, analysis prompts, system prompts), each with domain-specific rubric adjustments.
- **Iterative Optimization Loop** – Feed the optimizer's output back into itself 3 times to see if quality improves with each pass or if there are diminishing returns (a minimal loop sketch follows this list).
- **Adversarial Testing** – Add a step where the optimizer tries to find edge cases that would break the prompt (unusual inputs, ambiguous requests, boundary conditions) and then hardens the prompt against them.
- **Prompt Style Transfer** – Build a variant that takes a working prompt in one style (e.g., casual and short) and transforms it into another style (e.g., formal and detailed) while preserving functionality.
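For the Iterative Optimization Loop, the wiring is a short feedback loop around the `optimize()` function from the earlier SDK sketch. A minimal version follows; the section-header regex is a best-effort assumption that the model keeps the report format defined in this project.

```python
# Sketch of the Iterative Optimization Loop extension. `optimize` is the
# function from the earlier SDK sketch (prompt in, full report out); the
# regex assumes the "### 🔧 Optimized Prompt" header from the report format.
import re
from typing import Callable

def extract_optimized(report: str) -> str:
    """Pull the rewritten prompt out of the optimized-prompt section."""
    m = re.search(r"### 🔧 Optimized Prompt\s*\n(.*?)(?:\n###|\Z)", report, re.S)
    if not m:
        raise ValueError("could not find the optimized prompt section")
    return m.group(1).strip()

def iterate(optimize: Callable[[str], str], seed: str, passes: int = 3) -> str:
    prompt = seed
    for i in range(passes):
        prompt = extract_optimized(optimize(prompt))
        print(f"--- pass {i + 1} ---\n{prompt}\n")
    return prompt  # compare per-pass rubric scores to spot diminishing returns
```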