Concept #17Mediumgen-ai-fundamentals

Design a prompt for sentiment analysis. What could go wrong?

#gen-ai#prompt-engineering

Answer

Designing Effective Prompts for Sentiment Analysis

Sentiment analysis is a common first task when deploying LLMs. Here's how to design robust prompts and handle the failure modes.

Basic Prompt Design

python
from openai import OpenAI
from enum import Enum
import json

client = OpenAI()

class Sentiment(str, Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"
    MIXED = "mixed"

SYSTEM_PROMPT = '''You are a sentiment analysis expert for an e-commerce platform.

Classify the sentiment of customer reviews. Consider:
- Overall tone, not just individual words
- Sarcasm and irony (e.g. "Oh great, another broken product" = Negative)
- Mixed sentiments (praise one aspect, criticise another = Mixed)

Respond ONLY with valid JSON matching this schema:
{"sentiment": "positive|negative|neutral|mixed", "confidence": 0.0-1.0, "reasoning": "brief explanation"}'''

def analyze_sentiment(review: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Review: {review}"}
        ],
        temperature=0,
        response_format={"type": "json_object"}
    )
    return json.loads(response.choices[0].message.content)

# Test
result = analyze_sentiment("Product is amazing but delivery took 3 weeks — unacceptable!")
print(result)
# {"sentiment": "mixed", "confidence": 0.95, "reasoning": "Positive product quality, negative delivery experience"}

Handling Failure Modes

Failure ModeExampleFix
Sarcasm misclassified"Oh great, another defect 🙄" → PositiveAdd sarcasm instruction + examples
Domain-specific terms"This knife has terrible flex" (flex = good for bakers)Add domain context to system prompt
Mixed sentiment collapsed"Love the product, hate the price" → PositiveExplicitly define Mixed class
JSON parsing failureLLM outputs extra textUse
text
response_format=json_object
+ try/except
Multilingual inputFrench review misclassifiedAdd: "Reviews may be in any language"
Emoji-heavy reviews"😍😍😍"Include emoji examples in few-shot

Production-Ready Implementation

python
import json
from typing import Optional

def safe_analyze_sentiment(review: str, fallback: Optional[str] = None) -> dict:
    try:
        result = analyze_sentiment(review)
        # Validate response schema
        assert result["sentiment"] in ["positive", "negative", "neutral", "mixed"]
        assert 0.0 <= result["confidence"] <= 1.0
        return result
    except (json.JSONDecodeError, KeyError, AssertionError) as e:
        # Retry with simplified prompt
        simplified = client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "user",
                "content": f"Rate this review as positive, negative, or neutral ONLY: {review}"
            }],
            temperature=0
        )
        sentiment = simplified.choices[0].message.content.strip().lower()
        return {"sentiment": sentiment, "confidence": 0.7, "reasoning": "Simplified fallback"}

Prompt Engineering Best Practices for Classification

  • Be explicit about edge cases (sarcasm, mixed sentiment, emojis)
  • Define your classes precisely — what separates Neutral from Mixed?
  • Use structured output (JSON) to prevent parsing errors
  • Set
    text
    temperature=0
    for deterministic classification
  • Include few-shot examples for ambiguous cases
  • Version your prompts — small changes can significantly affect accuracy

Key lesson: Always validate and sanitise user input before embedding it in a prompt. A user can inject "Ignore all previous instructions" — treat user content as untrusted data, not trusted instructions.