# 🎯 Few-Shot Prompting
Few-shot prompting means providing the AI with 2-5 examples of the task you want it to perform before giving it the actual task. You're teaching the AI a pattern through demonstrations so it can apply that pattern to new inputs. It's the AI equivalent of "learn by example."
The term comes from machine learning: "few-shot" means learning from a few examples, as opposed to "zero-shot" (no examples) or "many-shot" (lots of examples).
## Why This Matters
Few-shot prompting is one of the most reliable techniques in prompt engineering. It works because:
- Patterns are unambiguous: examples leave no room for misinterpretation
- Complex formats are easy to show: instead of describing a format in words, you demonstrate it
- Consistency improves: the AI tends to match the pattern of your examples
- Edge cases are handled: you can show the AI how to deal with tricky inputs
Studies show that few-shot prompting can improve accuracy by 20-50% compared to zero-shot prompting on tasks like classification, extraction, and formatting.
## Few-Shot vs. Zero-Shot
| Aspect | Zero-Shot | Few-Shot |
|---|---|---|
| Examples given | None | 2-5 |
| Best for | Simple, well-known tasks | Custom formats, classification, nuanced tasks |
| Accuracy | Good for common tasks | Better for specific/unusual tasks |
| Prompt length | Shorter | Longer (examples use tokens) |
| When to choose | Task is standard | Task is custom or the AI keeps getting it wrong |
Use zero-shot when the task is straightforward: "Translate this to Spanish," "Summarize this article."
Use few-shot when you need a specific pattern, custom classification, or the zero-shot output isn't right.
## How Many Examples?
The golden rule: 2-3 examples for simple tasks, 4-5 for complex ones.
| Examples | Trade-off |
|---|---|
| 1 | Might not establish a pattern (could be coincidence) |
| 2-3 | Sweet spot for most tasks: clear pattern, reasonable token usage |
| 4-5 | For complex or nuanced tasks; handles edge cases |
| 6+ | Diminishing returns, wastes tokens, rarely needed |
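A few-shot prompt is ultimately just concatenated text, so it is easy to assemble programmatically. Here is a minimal Python sketch; the `build_few_shot_prompt` helper and its labels are our own illustration, not part of any library:

```python
def build_few_shot_prompt(instruction, examples, query,
                          input_label="Input", output_label="Output"):
    """Assemble a few-shot prompt: instruction, 2-5 demonstrations, then the query."""
    if not 2 <= len(examples) <= 5:
        raise ValueError("Aim for 2-5 examples: one may not establish a "
                         "pattern, and six or more wastes tokens.")
    lines = [instruction, ""]
    for inp, out in examples:
        lines.append(f"{input_label}: {inp} → {output_label}: {out}")
    # Leave the output slot empty so the model completes the pattern.
    lines.append(f"{input_label}: {query} → {output_label}:")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Classify the sentiment of each review.",
    [("The movie was amazing!", "Positive"),
     ("Waste of time and money", "Negative"),
     ("It was okay, nothing special", "Neutral")],
    "Great soundtrack, weak plot",
    input_label="Input", output_label="Sentiment",
)
print(prompt)
```

Keeping prompt assembly in one function like this also makes the one-shot vs. few-shot comparison trivial: swap the `examples` list and nothing else changes.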
## Crafting Effective Few-Shot Examples

### Rule 1: Examples Must Be Consistent
Every example should follow the exact same format:
✅ Good (consistent format):

```
Input: "The movie was amazing!" → Sentiment: Positive
Input: "Waste of time and money" → Sentiment: Negative
Input: "It was okay, nothing special" → Sentiment: Neutral
```

❌ Bad (inconsistent format):

```
"The movie was amazing!" → that's positive
Negative: "Waste of time and money"
Input: "It was okay" => Neutral
```
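Format consistency can even be checked mechanically before a prompt ships. A small sketch using Python's `re` module; the pattern and label names are hypothetical and should be adapted to your own example template:

```python
import re

# One pattern every demonstration must match (illustrative format;
# adjust the regex to whatever template your examples use).
EXAMPLE_PATTERN = re.compile(
    r'^Input: ".+" → Sentiment: (Positive|Negative|Neutral|Mixed)$')

def check_consistency(example_lines):
    """Return the lines that deviate from the shared format."""
    return [line for line in example_lines if not EXAMPLE_PATTERN.match(line)]

good = [
    'Input: "The movie was amazing!" → Sentiment: Positive',
    'Input: "Waste of time and money" → Sentiment: Negative',
]
bad = [
    '"The movie was amazing!" → that\'s positive',
    'Negative: "Waste of time and money"',
    'Input: "It was okay" => Neutral',
]
print(check_consistency(good))  # [] — all consistent
print(check_consistency(bad))   # all three lines flagged
```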
### Rule 2: Cover Different Cases
Show variety to teach the AI to handle different scenarios:
```
Email: "Hey, can we push the meeting to 3pm?" → Category: Schedule Change
Email: "Your invoice #4521 is attached" → Category: Billing
Email: "The login page returns a 500 error" → Category: Bug Report
Email: "Can you add dark mode to the app?" → Category: Feature Request
```
### Rule 3: Include an Edge Case
Show the AI how to handle tricky or ambiguous inputs:
```
Text: "I love the speed but hate the battery life" → Sentiment: Mixed
Text: "Delivered on time" → Sentiment: Neutral (factual, not opinion)
```
### Rule 4: Keep Examples Realistic
Use real-world data that resembles your actual inputs, not toy examples.
## Prompt Example
```
Classify customer support emails into one of these categories:
Billing, Technical, Account, Feedback, or Other.

Examples:

Email: "I was charged twice for my subscription this month."
Category: Billing

Email: "The app crashes every time I try to upload a photo."
Category: Technical

Email: "How do I change the email address on my account?"
Category: Account

Email: "I really love the new dashboard design!"
Category: Feedback

Email: "What are your office hours?"
Category: Other

---

Now classify these emails:

Email 1: "I need a refund for my last payment."
Email 2: "The search feature is returning incorrect results."
Email 3: "Can I merge my two accounts into one?"
Email 4: "Your customer service is exceptional, thank you!"
Email 5: "I can't connect the app to my Bluetooth device."
```
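With chat-based APIs, the same demonstrations can also be encoded as alternating user/assistant message pairs instead of one text blob; models generally follow the pattern just as well. A sketch of the conversion — the function name is ours, and the `role`/`content` dict shape follows the common chat-completions convention rather than any specific SDK:

```python
def examples_to_messages(system_instruction, examples, new_email):
    """Encode few-shot demonstrations as user/assistant message pairs."""
    messages = [{"role": "system", "content": system_instruction}]
    for email, category in examples:
        # Each demonstration becomes a user turn plus the "ideal" assistant reply.
        messages.append({"role": "user", "content": f'Email: "{email}"'})
        messages.append({"role": "assistant", "content": f"Category: {category}"})
    messages.append({"role": "user", "content": f'Email: "{new_email}"'})
    return messages

messages = examples_to_messages(
    "Classify customer support emails into one of: "
    "Billing, Technical, Account, Feedback, or Other.",
    [("I was charged twice for my subscription this month.", "Billing"),
     ("The app crashes every time I try to upload a photo.", "Technical"),
     ("How do I change the email address on my account?", "Account")],
    "I need a refund for my last payment.",
)
```

The resulting `messages` list can be passed directly to most chat APIs; the model sees three completed exchanges and completes the fourth in the same style.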
## ❌ Bad Example

```
Classify these customer emails into categories
```
What categories? What format? Should you just name the category, or explain why? Without examples, the AI invents its own categories and format, which probably won't match your system.
## ✅ Improved Example

```
Classify customer emails into: Billing, Technical, Account, or General.

Return format: "Email: [first 10 words...] → [Category]"

Examples:
Email: "I was charged twice for my subscription" → Billing
Email: "App crashes when I click the settings button" → Technical
Email: "How do I reset my password?" → Account
Email: "What are your business hours?" → General

Now classify:
Email: "The payment on my last invoice seems incorrect"
Email: "I can't log into my account since yesterday"
Email: "The export feature generates corrupted CSV files"
Email: "Do you offer student discounts?"
Email: "I want to update my credit card information"
```
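One advantage of pinning down the return format like this is that the model's output becomes machine-parseable. A hedged sketch of a parser for that line shape, fed a plausible (invented) model response; the helper name and regex are ours:

```python
import re

CATEGORIES = {"Billing", "Technical", "Account", "General"}
# Expected line shape from the prompt's return format:
#   Email: [first 10 words...] → [Category]
LINE_RE = re.compile(r"^Email: (?P<snippet>.+) → (?P<category>\w+)$")

def parse_classifications(model_output):
    """Parse each output line, rejecting categories outside the allowed set."""
    results = []
    for line in model_output.strip().splitlines():
        m = LINE_RE.match(line.strip())
        if not m or m.group("category") not in CATEGORIES:
            raise ValueError(f"Unexpected line from model: {line!r}")
        results.append((m.group("snippet"), m.group("category")))
    return results

# Hypothetical model response in the requested format:
sample = """\
Email: The payment on my last invoice seems incorrect → Billing
Email: I can't log into my account since yesterday → Account
Email: Do you offer student discounts? → General"""
parsed = parse_classifications(sample)
```

Raising on unknown categories is deliberate: if the model drifts from the demonstrated pattern, you want the pipeline to fail loudly rather than silently insert a category your system does not know.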
## Practice Challenge

Task: Build a few-shot prompt that teaches the AI to convert casual text into professional business language.

1. Create 4 examples that show the transformation:
   - Casual input → Professional output
   - Include different scenarios (email, message, feedback, request)
2. Give the AI 3 new casual texts to convert.
3. Evaluate: Does the AI maintain the same level of formality across all three? Does it match the pattern of your examples?

Bonus: Try the same task with just 1 example (one-shot) vs. 4 examples (few-shot). Compare the consistency of the outputs.
## Real-World Scenario
Scenario: You're building a data pipeline that extracts structured information from unstructured product descriptions.
Without few-shot examples:

```
Extract product details from this description: 'The Sony WH-1000XM5 headphones feature 30-hour battery life, industry-leading noise cancellation, and come in black or silver for $349.99'
```
The AI returns a random format that doesn't match your database schema.
With few-shot examples:

```
Extract product details from descriptions into this exact format:

Description: 'Apple AirPods Pro 2 with adaptive noise cancellation, 6hr battery, available in white, priced at $249.'
Result: {brand: 'Apple', product: 'AirPods Pro 2', feature: 'adaptive noise cancellation', battery: '6hr', colors: ['white'], price: 249.00}

Description: 'Samsung Galaxy Buds2 Pro with intelligent ANC and 5 hours playback. Available in graphite, white, and bora purple. $199.99.'
Result: {brand: 'Samsung', product: 'Galaxy Buds2 Pro', feature: 'intelligent ANC', battery: '5hr', colors: ['graphite', 'white', 'bora purple'], price: 199.99}

Now extract from:
Description: 'The Sony WH-1000XM5 headphones feature 30-hour battery life, industry-leading noise cancellation, and come in black or silver for $349.99'
```
The AI now follows your exact schema because you demonstrated the pattern.
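In a real pipeline you would typically ask for strict JSON (the prompt above shows unquoted keys for readability) and validate each extracted record before it reaches the database. A minimal validation sketch; the `SCHEMA` mapping and `validate_record` helper are our own illustration of the idea:

```python
import json

# Hypothetical target schema: field name -> required Python type.
SCHEMA = {"brand": str, "product": str, "feature": str,
          "battery": str, "colors": list, "price": float}

def validate_record(raw_json):
    """Parse the model's JSON output and check it against the schema."""
    record = json.loads(raw_json)
    for field, expected in SCHEMA.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        if not isinstance(record[field], expected):
            raise TypeError(f"{field} should be {expected.__name__}")
    return record

# A plausible model response for the Sony description:
raw = ('{"brand": "Sony", "product": "WH-1000XM5", '
       '"feature": "industry-leading noise cancellation", '
       '"battery": "30-hour", "colors": ["black", "silver"], '
       '"price": 349.99}')
record = validate_record(raw)
```

Because the few-shot examples fix the field names and value shapes, a validator like this rarely fires — but when the model does drift, catching it here is far cheaper than repairing bad rows downstream.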
## When Few-Shot Beats Zero-Shot
Few-shot prompting is clearly better when:
- The output format is custom (not a standard format)
- The task involves classification into your specific categories
- The style or tone needs to match a specific pattern
- Zero-shot keeps getting it wrong, and examples fix it
- You're doing data extraction or transformation into a specific schema
## Interview Question
Q: What is few-shot prompting, how does it differ from zero-shot prompting, and when should you use each?
A: Few-shot prompting provides 2-5 examples of the desired input-output pattern before the actual task, teaching the AI through demonstration. Zero-shot prompting gives instructions without any examples, relying on the model's pre-trained knowledge. Few-shot is preferred when: (1) the task requires a custom format or classification scheme, (2) zero-shot output doesn't match expectations, (3) the task involves nuanced categorization or data transformation, or (4) consistency across multiple outputs is critical. Zero-shot is sufficient for common, well-defined tasks like translation or summarization. Best practices for few-shot: use 2-5 consistent examples, cover different cases including edge cases, and use realistic data that resembles actual inputs. Research shows few-shot can improve accuracy by 20-50% on classification and extraction tasks.
## Summary
- Few-shot prompting provides 2-5 examples before the actual task to teach the AI a pattern
- It improves accuracy by 20-50% on classification, extraction, and custom formatting tasks
- Use 2-3 examples for simple tasks, 4-5 for complex ones
- Examples must be consistent in format: identical structure across all examples
- Cover different cases including edge cases and ambiguous inputs
- Use realistic data that resembles your actual inputs
- Few-shot beats zero-shot when the task is custom, nuanced, or involves specific categories
- Zero-shot is fine for well-known tasks like translation, summarization, or simple questions
- Each example uses tokens, so balance quality with prompt length