Lesson 4 of 6
Temperature and sampling
What you will learn
- Define temperature in one sentence and predict its effect
- Choose a temperature for a real task rather than by gut feeling
- Combine temperature with top-p sampling correctly
Temperature is the most misused knob in AI. Most people leave it at whatever default their tool provides. After this lesson, you will know exactly when to crank it up and when to pin it to zero — and why.
What temperature actually does
Remember from Lesson 1: for each position in the output, the model produces a probability distribution over all possible next tokens. Temperature modifies that distribution before a token is selected.
Here is the intuition. Imagine the model is deciding the next word and the top candidates are:
- "Paris" — 70% probability
- "Lyon" — 15% probability
- "Berlin" — 8% probability
- "Tokyo" — 4% probability
- everything else — 3%
At temperature 0 (greedy decoding), the model always picks "Paris." Every time, no variation. Run the same prompt a hundred times, get the same output a hundred times.
At temperature 0.3, the distribution is sharpened but not collapsed. "Paris" still wins almost every time, but occasionally "Lyon" sneaks through. The output is mostly predictable with small variations.
At temperature 0.7, the sharpening is mild and the distribution stays close to the model's raw probabilities. "Paris" is still the most likely, but "Lyon," "Berlin," and even "Tokyo" keep a real shot. The output becomes noticeably more varied and creative.
At temperature 1.0, the distribution is used as-is from the model. All candidates get sampled according to their raw probabilities. The output is diverse, sometimes surprising, and occasionally incoherent.
Above 1.0, the distribution is flattened relative to the model's raw probabilities. Low-probability tokens get boosted. The output becomes increasingly random and often nonsensical. Most practitioners never go above 1.0.
Technically, temperature divides the logits (raw scores) before the softmax function converts them to probabilities. Lower temperature makes the high-probability tokens even more dominant. Higher temperature makes the distribution more uniform. You do not need to remember the math — just the effect.
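To make this concrete, here is a minimal sketch in Python of what temperature scaling does, using the illustrative Paris/Lyon/Berlin/Tokyo numbers from above. The function name and the treatment of "everything else" as a single bucket are assumptions for demonstration, not how a real decoder is organized.

```python
import math

def apply_temperature(probs, temperature):
    """Divide each token's log-probability (its logit, up to a constant)
    by the temperature, then renormalize with softmax."""
    if temperature == 0:
        # Greedy decoding: all probability mass on the single best token.
        best = max(probs, key=probs.get)
        return {tok: float(tok == best) for tok in probs}
    logits = {tok: math.log(p) / temperature for tok, p in probs.items()}
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(l - peak) for tok, l in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# The illustrative distribution from the lesson; "other" is an aggregate bucket.
candidates = {"Paris": 0.70, "Lyon": 0.15, "Berlin": 0.08, "Tokyo": 0.04, "other": 0.03}

for t in (0.3, 0.7, 1.0, 1.5):
    rescaled = apply_temperature(candidates, t)
    print(t, {tok: round(p, 3) for tok, p in rescaled.items()})
```

Running it shows "Paris" climbing toward near-certainty at 0.3 and the tail tokens gaining ground above 1.0: exactly the sharpening and flattening described above.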
The sweet spots
Here is a practical guide based on real-world usage patterns:
Temperature 0 — Deterministic and safe. Use for: data extraction, classification, code generation, structured output (JSON/XML), any task where consistency matters more than creativity. When you need the same input to produce the same output every time, temperature 0 is the answer.
Temperature 0.3 to 0.5 — The workhorse range. Use for: most professional writing, explanations, analysis, email drafts, technical documentation. You get reliable quality with just enough variation to avoid robotic repetition. This is where most day-to-day AI work lives.
Temperature 0.7 to 1.0 — Creative territory. Use for: brainstorming, creative writing, generating alternative approaches, ideation sessions, marketing copy where you want flair. Higher temperatures give you diversity, which is valuable when you are exploring possibilities rather than seeking one correct answer.
The practical rule: start at 0.3 and adjust based on results. If the output is too repetitive or boring, raise the temperature. If it is too wild or makes factual errors, lower it.
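If you want that guide in a form you can drop into a script, the sketch below encodes the ranges as a lookup. The category names and exact defaults are this lesson's suggestions, not a standard.

```python
# Starting points from the guide above; tune from results, per the
# practical rule: raise if repetitive, lower if wild or error-prone.
TEMPERATURE_GUIDE = {
    "extraction": 0.0,   # data extraction, classification, JSON/XML output
    "code": 0.0,         # code generation
    "writing": 0.4,      # drafts, explanations, docs (the 0.3-0.5 workhorse)
    "brainstorm": 0.8,   # ideation, creative writing (0.7-1.0)
}

def starting_temperature(task_kind: str) -> float:
    """Return a starting temperature for a task category."""
    return TEMPERATURE_GUIDE.get(task_kind, 0.3)  # unknown task: start low
```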
Top-p: the companion control
Top-p sampling (also called nucleus sampling) is a different approach to the same problem. Instead of modifying the entire distribution, top-p sets a cumulative probability threshold. If you set top-p to 0.9, the model only considers the smallest set of tokens whose probabilities add up to 90%, and ignores everything else.
Using the earlier example with top-p at 0.9: "Paris" (70%) + "Lyon" (15%) + "Berlin" (8%) = 93%. So the model would choose among Paris, Lyon, and Berlin, ignoring Tokyo and everything else.
Top-p is useful because it adapts to context. When the model is very confident (one token dominates), top-p naturally narrows the choices. When the model is uncertain (probabilities are spread out), top-p allows more options. This is more adaptive than temperature, which applies the same modification regardless of context.
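Here is a minimal sketch of the cutoff logic, reusing the candidate distribution from the temperature example; the function name is hypothetical.

```python
def top_p_filter(probs, p=0.9):
    """Keep the smallest set of most-likely tokens whose cumulative
    probability reaches p, then renormalize the survivors to sum to 1."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = {}, 0.0
    for token, prob in ranked:
        kept[token] = prob
        cumulative += prob
        if cumulative >= p:
            break  # threshold reached; everything less likely is dropped
    total = sum(kept.values())
    return {token: prob / total for token, prob in kept.items()}

candidates = {"Paris": 0.70, "Lyon": 0.15, "Berlin": 0.08, "Tokyo": 0.04, "other": 0.03}
print(top_p_filter(candidates, 0.9))
# Keeps Paris, Lyon, Berlin (cumulative 93%); Tokyo and the rest are cut.
```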
How to combine them. Most APIs let you set both temperature and top-p. The standard advice:
- If you adjust temperature, leave top-p at its default (usually 1.0)
- If you prefer top-p control, leave temperature at 1.0 and adjust top-p
- Adjusting both simultaneously is valid but harder to reason about — one knob at a time is cleaner
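Tying the two sketches together, a single sampling step might look like the following. Real decoders typically apply temperature before the top-p cutoff, which is the order used here, but implementations vary.

```python
import random

def sample_token(probs, temperature=1.0, top_p=1.0):
    """One decoding step: warp the distribution with temperature,
    prune it with top-p, then draw a token. The defaults mirror the
    advice above: move one knob away from 1.0, leave the other alone."""
    warped = apply_temperature(probs, temperature)  # from the earlier sketch
    pruned = top_p_filter(warped, top_p)            # from the earlier sketch
    tokens, weights = zip(*pruned.items())
    return random.choices(tokens, weights=weights)[0]

sample_token(candidates, temperature=0.4)  # temperature-style control
sample_token(candidates, top_p=0.9)        # top-p-style control
```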
Common mistakes
Mistake 1: Using high temperature for factual tasks. If you ask "What is the boiling point of water?" at temperature 1.0, the model might occasionally say 99 or 101 degrees instead of 100, because low-probability tokens keep a real chance of being sampled. For factual work, lower temperature protects accuracy.
Mistake 2: Using temperature 0 for creative tasks. You will get the same output every time. That is the opposite of creative exploration. If you are brainstorming, you want diversity.
Mistake 3: Thinking temperature fixes bad prompts. If the model gives poor output, changing the temperature rarely helps. The problem is usually in the prompt, not the sampling. Fix the prompt first, then tune the temperature.
Mistake 4: Ignoring temperature entirely. Many chat interfaces hide temperature behind defaults. But if you use APIs or developer tools, explicitly setting temperature for each use case is one of the highest-leverage moves you can make.
Temperature in practice
When you use Claude through the chat interface, you are typically using a default temperature set by Anthropic. When you use the API, you control it directly. Most programming frameworks default to temperature 1.0, which is too high for many professional use cases. Always set it explicitly.
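As an example of setting it explicitly, here is a minimal sketch using the Anthropic Python SDK. The model name is a placeholder assumption (check the current documentation for valid values), and note that in practice even temperature 0 is not guaranteed to be perfectly deterministic on a live API.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# An extraction task, so temperature is pinned to 0 for consistency.
response = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder: substitute a current model
    max_tokens=500,
    temperature=0,  # set explicitly; do not rely on the default
    messages=[{"role": "user", "content": "Extract every date from this email as JSON."}],
)
print(response.content[0].text)
```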
A simple decision process: ask yourself, "Do I want the model to be creative or correct?" The more you need correctness and consistency, the lower your temperature. The more you need novelty and variety, the higher.
What is next
You now understand the two fundamental controls: what the model sees (context window) and how it picks tokens (temperature and sampling). The next lesson zooms out to a bigger question: when you want to change the model's behavior, should you adjust your prompt, add your own data, or retrain the model? The answer is almost always simpler than you think.
Temperature: the creativity knob
Temperature controls how random the choice of the next token is. At zero, the model always picks the most likely token: deterministic and consistent output, but monotonous. At 1.0, the probabilities are used as-is, so the output is varied but can drift. The sweet spot for most professional work is 0.3 to 0.5: reliable quality with enough variation to avoid monotony. Use zero for data extraction and code, and 0.7 to 1.0 for creative writing and brainstorming.
Top-p is a companion control that sets a cumulative probability threshold: at 0.9, for example, the model considers only the smallest set of tokens whose probabilities add up to 90%. It is best to tune one parameter at a time, either temperature or top-p, rather than adjusting both together.
The practical rule: start at 0.3 and adjust based on the result. If the output is dull, raise it; if it rambles or contains errors, lower it.
Try it yourself
Run the same prompt at temperatures 0, 0.4, 0.8, and 1.0. Compare the answers side by side. Pick the temperature you would use in production for this task, and justify your choice.
Reflect
Think about your default temperature setting (or the default in your tools). Is it actually right for most of your tasks, or are you leaving it on autopilot?