System vs user messages

When you use the Claude API, every request carries a messages array and can also carry an optional system parameter. Most beginners skip the system parameter and dump everything into the user message. This lesson explains why the split matters and how to use it correctly.

What goes where

The system message defines who the model is for the duration of the conversation. It sets persona, rules, constraints, and output format expectations. Think of it as the model's job description.

The user message defines what the model should do right now. It carries the specific task, the input data, and any per-request instructions.

Here is the principle: anything that stays the same across multiple requests belongs in the system message. Anything that changes per request belongs in the user message.
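One way to picture the principle is a small request builder, sketched below. The names REVIEW_SYSTEM_PROMPT and build_review_request are hypothetical, and the function only assembles keyword arguments; it does not call the API.

```python
# Hypothetical sketch: the stable part lives in one constant,
# the per-request part arrives as a function argument.

REVIEW_SYSTEM_PROMPT = (
    "You are a senior software engineer who reviews code. "
    "Be concise. Focus on bugs, not style nits."
)


def build_review_request(diff: str) -> dict:
    """Return keyword arguments suitable for client.messages.create()."""
    return {
        "model": "claude-sonnet-4-20250514",
        "max_tokens": 1024,
        "system": REVIEW_SYSTEM_PROMPT,  # identical for every request
        "messages": [
            # Only this part changes from one request to the next.
            {"role": "user", "content": f"Review this diff:\n{diff}"}
        ],
    }


first = build_review_request("- x = 1\n+ x = 2")
second = build_review_request("- y = []\n+ y = None")
```

With a configured client, client.messages.create(**first) would send the request. The two dictionaries share the same system value and differ only in messages.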

A concrete example

Suppose you are building a code review assistant. Here is the wrong way:

# Anti-pattern: everything in the user message
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": """You are a senior software engineer who reviews code.
Be concise. Focus on bugs, not style. Never suggest rewriting
from scratch. Use this format: file:line — issue — fix.

Review this diff:
- def calc(x): return x * 2.0
+ def calc(x): return x * 2"""
        }
    ]
)

And here is the correct way:

# Correct: persistent behavior in system, task in user
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system="""You are a senior software engineer who reviews code.
Be concise. Focus on bugs, not style nits.
Never suggest rewriting from scratch.
Format each finding as: file:line — issue — suggested fix.""",
    messages=[
        {
            "role": "user",
            "content": """Review this diff:
- def calc(x): return x * 2.0
+ def calc(x): return x * 2"""
        }
    ]
)

For a single request, the two versions usually produce much the same output. The difference shows up at scale.

Why the split matters

Reusability. The system message stays the same for every code review. The user message changes with every diff. You write the persona once and reuse it across thousands of requests.

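Prompt caching. Anthropic's API supports prompt caching, which matches on the prefix of a request. When the system message is identical across requests and is marked as cacheable (typically with a cache_control entry on the system content block), the API can reuse the already processed prefix instead of reprocessing it. This reduces latency and cost. Very short prompts fall below the minimum cacheable length, so the benefit shows up with longer system prompts. If you mix the rules into the same user message string as data that changes on every request, there is no stable block to cache and you get no benefit.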
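As a rough sketch of what marking a system prompt for caching can look like: the system parameter is passed as a list of text blocks, and the block carries a cache_control field of type "ephemeral". The helper names cached_system and build_request are hypothetical, the prompt text is a placeholder, and whether a given prompt is actually cached depends on its length and the model.

```python
# Sketch: flag the stable system prompt as a cacheable prefix.
# Assumption: the system parameter is sent as a list of text blocks
# rather than a plain string, so that cache_control can be attached.


def cached_system(prompt: str) -> list[dict]:
    """Wrap a system prompt in a text block flagged for prompt caching."""
    return [
        {
            "type": "text",
            "text": prompt,
            "cache_control": {"type": "ephemeral"},
        }
    ]


def build_request(system_prompt: str, task: str) -> dict:
    """Return keyword arguments suitable for client.messages.create()."""
    return {
        "model": "claude-sonnet-4-20250514",
        "max_tokens": 1024,
        "system": cached_system(system_prompt),  # stable prefix
        "messages": [{"role": "user", "content": task}],  # varies per request
    }


SYSTEM_PROMPT = "You are a senior software engineer who reviews code. Be concise."
req_a = build_request(SYSTEM_PROMPT, "Review this diff: ...")
req_b = build_request(SYSTEM_PROMPT, "Review this other diff: ...")
```

After a real call, the usage object on the response reports cache_creation_input_tokens and cache_read_input_tokens, which is one way to check whether a request wrote to or read from the cache.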

Clarity of intent. Separating "who you are" from "what to do now" makes prompts easier to read, debug, and version-control. When output quality degrades, you know to look at either the system prompt (behavior changed) or the user message (task changed), not both.
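One possible way to make this debuggable in practice is to log a short fingerprint of the system prompt next to each task. The sketch below is an illustration only: prompt_fingerprint, SYSTEM_V1, SYSTEM_V2, and the log record layout are all hypothetical names, not part of any SDK.

```python
import hashlib


def prompt_fingerprint(system_prompt: str) -> str:
    """Short, stable identifier: first 12 hex characters of the SHA-256 digest."""
    return hashlib.sha256(system_prompt.encode("utf-8")).hexdigest()[:12]


SYSTEM_V1 = "You are a senior software engineer who reviews code. Be concise."
SYSTEM_V2 = SYSTEM_V1 + " Never suggest rewriting from scratch."

# Hypothetical log record: if output quality shifts while the fingerprint
# is unchanged, the task changed; if the fingerprint changed, the behavior did.
log_record = {
    "system_fingerprint": prompt_fingerprint(SYSTEM_V1),
    "task": "Review this diff: ...",
}
```

Because the fingerprint depends only on the system prompt, it stays constant across tasks and changes whenever the prompt text is edited.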

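Multi-turn stability. The Messages API is stateless, so in a multi-turn conversation your code resends the system parameter together with the message history on every call. Because the system message sits outside the messages array, it is unaffected when you trim or summarize older turns to save tokens. Rules placed in the first user message are lost as soon as that message is trimmed, and they can carry less weight as the conversation grows. The system message stays anchored.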
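A minimal sketch of that idea, assuming the application keeps its own history list and trims it before each call. The helpers trim_history and build_turn_request are hypothetical, and the trimming policy (keep the last few messages) is only one of many options.

```python
SYSTEM_PROMPT = "Always respond in JSON."


def trim_history(history: list[dict], max_messages: int) -> list[dict]:
    """Keep at most the most recent max_messages messages."""
    trimmed = history[-max_messages:] if max_messages > 0 else []
    # Drop a leading assistant message so the trimmed history
    # starts on a user turn.
    while trimmed and trimmed[0]["role"] != "user":
        trimmed = trimmed[1:]
    return trimmed


def build_turn_request(history: list[dict], max_messages: int = 6) -> dict:
    """Return keyword arguments suitable for client.messages.create()."""
    return {
        "model": "claude-sonnet-4-20250514",
        "max_tokens": 512,
        "system": SYSTEM_PROMPT,  # resent on every call, untouched by trimming
        "messages": trim_history(history, max_messages),
    }


# Simulate a conversation of five completed turns plus a new question.
history = []
for i in range(5):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f'{{"answer": {i}}}'})
history.append({"role": "user", "content": "question 5"})

request = build_turn_request(history, max_messages=4)
```

The earliest turns are gone from request["messages"], but the JSON rule still travels with the request because it lives in system rather than in the first user message.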

Anti-patterns to avoid

1. Rules in the user message. If you put "always respond in JSON" in the user message, the model may forget it after a few turns. Put it in the system message.

2. The 2,000-word system prompt. A system message that tries to cover every possible edge case often contradicts itself. Keep it focused: persona, 3-5 rules, output format. If you need extensive context, put the reference material in the user message with clear delimiters.

3. Duplicating instructions. Saying "be concise" in both the system and user messages does not make the model twice as concise. It wastes tokens and can confuse priority.

4. No system message at all. Without a system message you get the model's default behavior, which you have not specified and cannot version alongside your code. For production, always set an explicit system message, even a short one.
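Some of these anti-patterns can be caught mechanically before a request is sent. The sketch below is a rough heuristic under stated assumptions, not an established tool: lint_prompt is a hypothetical helper, the 2,000-word threshold simply mirrors item 2, and the duplicate check only finds sentences repeated word for word. It covers items 2 to 4; item 1 still needs human judgment.

```python
import re


def lint_prompt(system: str, user: str, max_system_words: int = 2000) -> list[str]:
    """Return warnings for some of the anti-patterns above (heuristic only)."""
    warnings = []

    # Item 4: no system message at all.
    if not system.strip():
        warnings.append("no system message")

    # Item 2: an overly long system message.
    if len(system.split()) > max_system_words:
        warnings.append("system message is very long")

    # Item 3: split both messages into lowercase sentences and look for
    # sentences that appear in both.
    def sentences(text: str) -> set[str]:
        parts = re.split(r"[.!?\n]+", text.lower())
        return {p.strip() for p in parts if p.strip()}

    for sentence in sorted(sentences(system) & sentences(user)):
        warnings.append(f"instruction repeated in both messages: {sentence!r}")

    return warnings


result = lint_prompt(
    system="You review code. Be concise.",
    user="Be concise. Review this diff: ...",
)
```

Here result contains a single warning about the repeated "be concise" instruction. A check like this could run in a test suite next to the prompt definitions.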

The TypeScript pattern

If you are using the Anthropic TypeScript SDK, the structure is the same:

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const response = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 512,
  system: `You are a customer support agent for a developer tools company.
Respond in 1-3 sentences. If you don't know the answer, say so.
Never make up product features.`,
  messages: [
    {
      role: "user",
      content: "Does your API support batch processing?",
    },
  ],
});

The system message is your single source of truth for behavior. The user message is disposable — it changes with every request.

A rule of thumb

Before sending any prompt, ask yourself: "If I sent a completely different task with the same system message, would the system message still make sense?" If yes, the split is correct. If the system message contains task-specific data, move that data to the user message.
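The rule of thumb can be approximated with a simple check that looks for task-specific strings inside the system message. This is only a sketch: system_contains_task_data is a hypothetical helper, billing.py is an invented file name, and a verbatim substring match will miss paraphrased task data.

```python
def system_contains_task_data(system: str, task_inputs: list[str]) -> list[str]:
    """Return the task inputs that appear verbatim in the system message."""
    return [item for item in task_inputs if item and item in system]


good_system = "You are a senior software engineer who reviews code. Be concise."
# This version leaks details of one particular task into the system message.
bad_system = good_system + " The diff under review changes calc() in billing.py."

task_inputs = ["billing.py", "calc()"]

leaks_in_good = system_contains_task_data(good_system, task_inputs)
leaks_in_bad = system_contains_task_data(bad_system, task_inputs)
```

An empty result suggests the system message would still make sense with a different task; a non-empty result points at the data that belongs in the user message.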

What's next

Now that you know where each part of a prompt goes, the next lesson, Few-shot examples, teaches one of the most effective techniques for improving output quality: showing the model what you want instead of telling it.
