Batches

Not every API call needs a response in 2 seconds. Batch processing lets you submit up to 100,000 requests at once, get results within 24 hours, and pay 50% less than real-time pricing.

When to batch

Use batches for workloads that are:

Latency-tolerant: results needed in hours, not seconds
High-volume: dozens to thousands of items
Independent: each request doesn't depend on the previous one's result

Examples: document classification, data extraction from 500 PDFs, running eval sets, content moderation at scale, translating a backlog of articles.

Submitting a batch

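import Anthropic from "@anthropic-ai/sdk";

// The client reads ANTHROPIC_API_KEY from the environment by default.
const client = new Anthropic();

// items is your own array of inputs, e.g. [{ text: "..." }, ...]
const batch = await client.messages.batches.create({
  requests: items.map((item, i) => ({
    custom_id: `item-${i}`, // must be unique within the batch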
    params: {
      model: "claude-sonnet-4-6",
      max_tokens: 256,
      messages: [{
        role: "user",
        content: "Classify this text: " + item.text,
      }],
    },
  })),
});

console.log(batch.id); // msgbatch_abc123

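Each request in the batch gets a custom_id you choose, and it must be unique within the batch. Use it to match results back to your inputs, because results are not guaranteed to come back in submission order.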
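If your input list can exceed the 100,000-request limit mentioned above, one option is to split it into several batches while keeping every custom_id unique. The following is a minimal sketch of that idea, not an official helper: buildRequests, chunkRequests, and MAX_REQUESTS_PER_BATCH are hypothetical names, and the model and prompt simply mirror the example above.

```javascript
// Limit taken from the lesson text: up to 100,000 requests per batch.
const MAX_REQUESTS_PER_BATCH = 100_000;

// Build one request entry per item. The custom_id uses the item's global
// index, so ids stay unique even when the items span several batches.
function buildRequests(items, startIndex = 0) {
  return items.map((item, i) => ({
    custom_id: `item-${startIndex + i}`,
    params: {
      model: "claude-sonnet-4-6",
      max_tokens: 256,
      messages: [{ role: "user", content: `Classify this text: ${item.text}` }],
    },
  }));
}

// Split items into consecutive chunks no larger than the limit.
function chunkRequests(items, limit = MAX_REQUESTS_PER_BATCH) {
  const chunks = [];
  for (let start = 0; start < items.length; start += limit) {
    chunks.push(buildRequests(items.slice(start, start + limit), start));
  }
  return chunks;
}

// Demo with a tiny limit so the chunking is visible.
const demo = chunkRequests([{ text: "a" }, { text: "b" }, { text: "c" }], 2);
console.log(demo.map((chunk) => chunk.map((r) => r.custom_id)));
// [ [ 'item-0', 'item-1' ], [ 'item-2' ] ]
```

Each chunk can then be passed as the requests array of its own client.messages.batches.create call.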

Polling for results

Batches are async. Poll for status:

let status = await client.messages.batches.retrieve(batch.id);

// Keep polling until processing has ended; "canceling" is also a non-final state.
while (status.processing_status !== "ended") {
  await new Promise(r => setTimeout(r, 30_000)); // 30s interval
  status = await client.messages.batches.retrieve(batch.id);
}

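// Stream results one at a time; in the TypeScript SDK, results() is awaited first
for await (const result of await client.messages.batches.results(batch.id)) {
  if (result.result.type === "succeeded") {
    // content is a list of blocks; take the text from the first text block
    const block = result.result.message.content.find(b => b.type === "text");
    const content = block ? block.text : "";
    // Process result for result.custom_id
  }
}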
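A fixed 30-second interval works, but a long-running batch may be better served by a growing interval and an overall deadline. The sketch below is one way to do that under stated assumptions: pollDelayMs and waitForBatch are hypothetical helpers, the retrieve function is injected so the loop can be exercised without the API, and the default delays are illustrative rather than recommended values.

```javascript
// Delay before the next poll: doubles from baseMs on each attempt, capped at maxMs.
function pollDelayMs(attempt, baseMs = 30_000, maxMs = 300_000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Poll until the batch reports "ended", or throw once the deadline has passed.
// `retrieve` is injected, e.g. (id) => client.messages.batches.retrieve(id).
async function waitForBatch(
  retrieve,
  batchId,
  { baseMs = 30_000, maxMs = 300_000, timeoutMs = 24 * 60 * 60 * 1000 } = {}
) {
  const deadline = Date.now() + timeoutMs;
  for (let attempt = 0; ; attempt++) {
    const status = await retrieve(batchId);
    if (status.processing_status === "ended") return status;
    if (Date.now() >= deadline) {
      throw new Error(`Batch ${batchId} did not end before the deadline`);
    }
    await new Promise((resolve) =>
      setTimeout(resolve, pollDelayMs(attempt, baseMs, maxMs))
    );
  }
}
```

With the client from the earlier example, a call would look like waitForBatch((id) => client.messages.batches.retrieve(id), batch.id). Injecting retrieve keeps the loop independent of the SDK, which makes it easy to test with a fake.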

Handling failures

Some requests in a batch may fail while others succeed. Always check result.result.type:

succeeded: normal response
errored: the request hit an API error
canceled: the batch was canceled before this request was processed
expired: the batch reached its 24-hour limit before this request was processed

Build your pipeline to handle partial results: store the successes and retry the failures in a separate batch.
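The store-and-retry advice above can be sketched as two small functions. This is an illustration under assumptions, not part of the SDK: partitionResults and selectRetries are hypothetical names, the sample data stands in for real batch output, and the function takes a plain iterable, so results streamed from the API would need to be collected into an array first.

```javascript
// Split batch results into successes and failures.
// `results` is any iterable of { custom_id, result: { type, ... } } entries.
function partitionResults(results) {
  const succeeded = new Map(); // custom_id -> message
  const failed = [];           // { custom_id, type } for anything not "succeeded"
  for (const { custom_id, result } of results) {
    if (result.type === "succeeded") {
      succeeded.set(custom_id, result.message);
    } else {
      failed.push({ custom_id, type: result.type });
    }
  }
  return { succeeded, failed };
}

// Pick the original requests whose custom_id failed, ready for a new batch.
function selectRetries(requests, failed) {
  const failedIds = new Set(failed.map((f) => f.custom_id));
  return requests.filter((r) => failedIds.has(r.custom_id));
}

// Hypothetical sample data standing in for real batch input and output.
const originalRequests = [
  { custom_id: "item-0", params: {} },
  { custom_id: "item-1", params: {} },
  { custom_id: "item-2", params: {} },
];
const sampleResults = [
  { custom_id: "item-0", result: { type: "succeeded", message: { content: [{ type: "text", text: "spam" }] } } },
  { custom_id: "item-1", result: { type: "errored" } },
  { custom_id: "item-2", result: { type: "expired" } },
];

const { succeeded, failed } = partitionResults(sampleResults);
const retries = selectRetries(originalRequests, failed);
console.log(succeeded.size, retries.map((r) => r.custom_id));
// 1 [ 'item-1', 'item-2' ]
```

Keeping the failure type alongside each custom_id lets you decide per type what to resubmit. A request that errored because it was malformed, for example, will fail again unless it is fixed first.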

Cost comparison

Method	Input cost (Opus)	Speed	Best for
Real-time	$5 / MTok	Seconds	User-facing features
Real-time + caching	$0.50 / MTok cached	Seconds	Repeated prompts
Batches	$2.50 / MTok	Hours	Bulk processing
Batches + caching	$0.25 / MTok cached	Hours	Maximum savings

On Opus 4.7, batches also support extended output (up to 300K tokens per response with the output-300k-2026-03-24 beta header).
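To make the table concrete, here is a small worked calculation using the input prices listed above. It is a rough sketch: the workload (500 documents at 40,000 input tokens each) is hypothetical, the cached rows assume every input token is billed at the cached rate, and output tokens and cache-write costs are left out.

```javascript
// Input prices in USD per million tokens (MTok), copied from the table above.
const INPUT_PRICE_PER_MTOK = {
  realTime: 5.0,
  realTimeCached: 0.5,
  batch: 2.5,
  batchCached: 0.25,
};

// Cost in USD of `tokens` input tokens at a given per-MTok price.
function inputCostUSD(tokens, pricePerMTok) {
  return (tokens / 1_000_000) * pricePerMTok;
}

// Hypothetical workload: 500 documents x 40,000 input tokens = 20M tokens.
const totalTokens = 500 * 40_000;
for (const [method, price] of Object.entries(INPUT_PRICE_PER_MTOK)) {
  console.log(`${method}: $${inputCostUSD(totalTokens, price).toFixed(2)}`);
}
// realTime: $100.00
// realTimeCached: $10.00
// batch: $50.00
// batchCached: $5.00
```

In practice only the shared prefix of each prompt is read from cache, so real costs with caching fall somewhere between the cached and uncached figures.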

What's next

You've covered every API capability. The final lesson puts it all together: going to production with logging, monitoring, fallbacks, cost guardrails, and a kill switch.


