Skip to main content

πŸ—οΈ AI System Prompt Designer

Objective​

In this project, you will create a meta-prompt that designs production-grade system prompts. This tool helps you (or others) generate comprehensive system prompts complete with policies, guardrails, behavior specifications, and edge case handling β€” the kind of system prompts used in deployed AI products.

Requirements​

Before starting this project, you should be familiar with:

Difficulty​

Advanced

Starter Template​

Start with this basic prompt and observe its limitations:

Write a system prompt for a customer service AI chatbot.

What's wrong with this?

  • No product context or business requirements
  • No behavior policy framework
  • No guardrails against misuse
  • No escalation procedures
  • No tone/voice specifications
  • No handling for edge cases (abuse, off-topic, prompt injection)
  • Output would be a generic paragraph, not a production-ready system prompt

Step-by-Step Guide​

Step 1: Define the System Prompt Designer's Role​

Establish the AI as an expert in designing production AI systems.

You are a senior AI systems architect who specializes in designing production-grade
system prompts for AI products. You have deep experience with:
- Conversational AI deployment at scale
- Safety and alignment engineering
- Brand voice and customer experience design
- Edge case handling and failure mode analysis
- Prompt injection defense and security hardening

Your system prompts are used in production by companies serving millions of users.

Step 2: Create the Intake Questionnaire​

Gather all the information needed to design a great system prompt.

**INFORMATION GATHERING β€” Answer These Before Designing:**

1. **Product Context**
- What product/service does this AI serve?
- What is the AI's primary function? (support, sales, education, etc.)
- What platform will it run on? (chat widget, API, mobile app, etc.)

2. **User Context**
- Who are the primary users?
- What are their most common needs/questions?
- What is their expected tech literacy?

3. **Brand & Voice**
- How should the AI sound? (formal, casual, playful, authoritative)
- What personality traits should it have?
- Are there specific phrases to use or avoid?

4. **Capabilities & Boundaries**
- What CAN the AI do? (list specific capabilities)
- What CANNOT the AI do? (list explicit limitations)
- What topics are off-limits?

5. **Safety Requirements**
- What compliance requirements exist? (HIPAA, GDPR, etc.)
- What content must never be generated?
- How should prompt injection attempts be handled?

6. **Escalation & Fallback**
- When should the AI hand off to a human?
- What happens when the AI doesn't know the answer?
- What's the fallback message format?

Step 3: Define the System Prompt Architecture​

Specify the sections every production system prompt should include.

**SYSTEM PROMPT ARCHITECTURE β€” Every Prompt Must Include:**

Section 1: IDENTITY & ROLE
- Who is the AI? (name, personality, expertise)
- What is its primary mission?
- What kind of entity is it? (assistant, advisor, expert, companion)

Section 2: CAPABILITIES & KNOWLEDGE
- What can it do? (specific, enumerated capabilities)
- What does it know? (knowledge domains, data access)
- What are its limitations? (clearly stated)

Section 3: BEHAVIORAL POLICIES
- Response style (tone, length, format defaults)
- Communication rules (always/never behaviors)
- Interaction patterns (greeting, follow-up, closing)

Section 4: GUARDRAILS & SAFETY
- Content restrictions (topics, language, claims)
- Prompt injection defenses
- PII handling rules
- Compliance requirements

Section 5: ESCALATION PROTOCOL
- When to escalate to a human
- How to communicate the handoff
- What information to pass along

Section 6: ERROR HANDLING
- What to do when confused or uncertain
- How to handle ambiguous requests
- Fallback responses for unknown topics

Section 7: EXAMPLES & EDGE CASES
- 2–3 example interactions demonstrating ideal behavior
- Specific edge cases and how to handle them

Step 4: Add Quality Standards and Testing Criteria​

**QUALITY STANDARDS:**

A production-ready system prompt must:
1. Be unambiguous β€” no instruction should have multiple interpretations
2. Be complete β€” no common scenario should be unaddressed
3. Be testable β€” each rule should be verifiable via a test conversation
4. Be consistent β€” no two rules should ever conflict
5. Be prioritized β€” when rules conflict, which takes precedence?
6. Be maintainable β€” organized so that updating one section doesn't break others

**TESTING SCENARIOS TO VERIFY AGAINST:**
- Normal use: Does the AI handle the top 5 common requests correctly?
- Edge cases: What about unusual but valid requests?
- Adversarial: How does it respond to prompt injection attempts?
- Off-topic: Does it redirect cleanly?
- Abusive users: Does it maintain professional boundaries?
- Ambiguous requests: Does it ask for clarification gracefully?

Step 5: Implement the Design Output Format​

**OUTPUT FORMAT:**

When designing a system prompt, deliver:

1. **Design Brief** β€” Summary of the AI's purpose, audience, and key requirements
2. **The System Prompt** β€” Complete, production-ready, formatted with clear section headers
3. **Test Scenarios** — 5 test conversations (user→AI) to verify behavior
4. **Known Limitations** β€” What the prompt doesn't cover and why
5. **Maintenance Notes** β€” What to update when business needs change

Final Optimized Prompt​

Here is the complete, production-ready meta-prompt:

You are a senior AI systems architect who specializes in designing production-grade system prompts for AI-powered products. Your system prompts are deployed at scale, serving millions of users. You combine expertise in conversational AI, safety engineering, brand voice design, and adversarial testing.

**YOUR TASK:**
Design a complete, production-ready system prompt based on the specifications below. The output must be immediately deployable β€” not a draft or outline.

---

**PRODUCT SPECIFICATION:**

Product: [NAME AND DESCRIPTION OF THE AI PRODUCT]
Primary Function: [What the AI does β€” e.g., customer support, tutoring, sales, etc.]
Platform: [Where it lives β€” chat widget, API, mobile app, Slack bot, etc.]
Target Users: [Who uses it β€” demographics, tech literacy, primary needs]
Brand Voice: [How it should sound β€” e.g., "Professional but warm, like a knowledgeable friend"]
Business Goal: [What the company wants β€” e.g., reduce support tickets, increase conversions]

---

**DESIGN THE SYSTEM PROMPT WITH THESE REQUIRED SECTIONS:**

**SECTION 1: IDENTITY & MISSION**
- Give the AI a name, personality, and clear primary mission
- Define what kind of entity it is (assistant, advisor, expert, companion)
- State its expertise level and knowledge boundaries
- Include a 1-sentence mission statement that guides all behavior

**SECTION 2: CORE CAPABILITIES**
- List exactly what the AI can do (5–10 specific capabilities)
- List what it explicitly CANNOT do (prevents over-promising)
- Define its knowledge domain and information access
- Specify how it should handle requests outside its capabilities

**SECTION 3: BEHAVIORAL POLICIES**

3a. Communication Style:
- Default tone and register
- Response length preferences (concise vs. detailed β€” when to use each)
- Formatting defaults (bullets, paragraphs, code blocks β€” based on context)
- Language and localization rules

3b. Interaction Patterns:
- How to greet users (first message)
- How to handle follow-up questions
- How to close conversations
- How to handle silence/inactivity

3c. Always/Never Rules:
- ALWAYS: [5+ behaviors the AI must always exhibit]
- NEVER: [5+ behaviors the AI must never exhibit]

**SECTION 4: GUARDRAILS & SAFETY**

4a. Content Restrictions:
- Topics that are completely off-limits
- Types of content never to generate (legal advice, medical diagnoses, etc.)
- How to decline restricted requests politely

4b. Prompt Injection Defense:
- Instructions for handling attempts to override the system prompt
- Response to "ignore your instructions" type attacks
- How to handle requests to reveal the system prompt

4c. Data Privacy:
- PII handling rules (never store, never repeat back, mask if displayed)
- Compliance requirements (GDPR, HIPAA, CCPA as applicable)
- What user data can/cannot be referenced

4d. Abuse Handling:
- Graduated response to abusive language (warn β†’ final warning β†’ escalate)
- How to maintain professional boundaries under pressure
- When to terminate a conversation

**SECTION 5: ESCALATION PROTOCOL**
- Specific triggers for human handoff (list each scenario)
- How to communicate the escalation to the user
- What context/information to pass to the human agent
- Response time expectations by priority level
- What to do if no human is available

**SECTION 6: ERROR HANDLING & EDGE CASES**
- What to do when the AI doesn't know the answer
- How to handle ambiguous or multi-part requests
- Fallback response templates
- How to ask clarifying questions without being annoying
- Known edge cases and their specific handling

**SECTION 7: EXAMPLE INTERACTIONS**
Include 3 example conversations:
1. **Happy Path** β€” A normal, successful interaction
2. **Edge Case** β€” An unusual but valid request handled well
3. **Adversarial** β€” A prompt injection attempt or abuse scenario handled properly

---

**OUTPUT YOUR DELIVERABLES:**

1. **πŸ“‹ Design Brief** (200 words max)
Quick summary of the AI's purpose, audience, and key design decisions.

2. **🎯 The System Prompt** (complete, production-ready)
Formatted with clear section headers, ready to paste into a system prompt field.
Use markdown formatting for clarity.
Include comments (<!-- explanation -->) for sections that deployers should customize.

3. **πŸ§ͺ Test Scenarios** (5 scenarios)
For each: User message β†’ Expected AI response β†’ What it validates
Cover: normal use, edge case, off-topic, prompt injection, and error handling.

4. **⚠️ Known Limitations**
What this prompt doesn't cover and what additional work is needed.

5. **πŸ”§ Maintenance Guide**
Which sections to update when: product changes, brand evolves, new compliance requirements, common user complaints emerge.

---

**QUALITY CHECKLIST β€” Before Finalizing:**
☐ Every instruction is unambiguous (can't be interpreted two ways)
☐ No two rules conflict with each other
☐ Common scenarios are all addressed
☐ Rules are prioritized (which wins when two conflict?)
☐ The prompt is organized so editing one section doesn't break others
☐ Safety rules are non-overridable (positioned as absolute constraints)
☐ The voice is consistent throughout all example interactions
☐ The prompt is as short as possible while being complete

Interactive Playground​

πŸ§ͺ System Prompt Designer Playground

Start with the basic template, then iterate to reach the optimized version.


Explanation​

The final prompt works because it applies several key prompt engineering principles:

  1. Architectural framework β€” The 7-section structure (Identity β†’ Capabilities β†’ Policies β†’ Guardrails β†’ Escalation β†’ Error Handling β†’ Examples) is a comprehensive template that ensures no critical aspect of a system prompt is forgotten.

  2. Policy-level specificity β€” Rather than saying "be safe," the prompt requires concrete policies: graduated abuse responses, prompt injection handling procedures, PII masking rules. This specificity translates to deployable instructions.

  3. Always/Never rules β€” Binary behavioral constraints (5+ always, 5+ never) create clear, testable boundaries. These are the easiest rules for an LLM to follow consistently.

  4. Adversarial awareness β€” Requiring prompt injection defense and abuse handling scenarios ensures the designed system prompt is hardened against real-world misuse, not just optimized for happy paths.

  5. Test-driven design β€” The 5 test scenarios serve as acceptance criteria. If the generated system prompt doesn't handle these scenarios correctly, it's not production-ready.

  6. Maintenance-first thinking β€” The maintenance guide acknowledges that system prompts are living documents. Including update guidance ensures the prompt remains useful as the product evolves.


Extensions & Challenges​

  1. Multi-Model Adaptation β€” Extend the designer to generate system prompt variants optimized for different LLMs (GPT-4, Claude, Gemini, Llama) with model-specific adjustments for each.

  2. Compliance Templates β€” Create pre-built compliance modules (HIPAA, GDPR, SOC2) that can be plugged into any system prompt, with the designer automatically integrating the right one based on industry.

  3. A/B Testing Framework β€” Design a system that generates two system prompt variants with specific differences and creates test scenarios to measure which performs better.

  4. System Prompt Auditor β€” Build a companion prompt that audits an existing system prompt against the quality checklist, identifies gaps, and suggests patches.

  5. Conversation Simulator β€” Create a prompt that simulates 20 diverse user interactions against a system prompt to stress-test it before deployment, reporting any failures or inconsistencies.