SP-API rate limits punish naive integrations. How Amazon throttles requests, what the limits actually are, and how to design ingestion that scales.
Amazon’s Selling Partner API throttles aggressively. Most teams that build their own integrations discover this the hard way — a job that worked fine in development hits 429 errors in production, retries make it worse, and ingestion stalls. Working within rate limits is half the engineering work behind any real Amazon data layer.
This guide covers what the limits actually are, how Amazon enforces them, and how to design ingestion around them.
TL;DR: SP-API uses dynamic per-endpoint rate limits with a token bucket model. Limits vary by endpoint, by seller account size and by Amazon’s server load. Production ingestion needs request queueing, exponential backoff on 429 errors, and concurrency-aware scheduling per endpoint. Naive parallel calls hit limits within seconds. The right architecture treats rate limits as a hard constraint and orchestrates accordingly.
Each endpoint has its own rate limit, expressed as requests per second and a burst quota. Amazon uses a token bucket model: you accumulate tokens up to a maximum, and each request consumes one. When the bucket is empty, requests get 429-throttled.
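To make the model concrete, here is a minimal client-side sketch of that token accounting in Python. The rate and burst values are illustrative placeholders, not any endpoint's real limits; tracking your own bucket lets you pause before Amazon throttles you rather than after.

```python
import time

class TokenBucket:
    """Client-side mirror of the token-bucket model SP-API applies
    server-side. Tokens refill continuously at `rate` per second up to
    `burst`; each request spends one token. The numbers used below are
    illustrative, not the limits of any specific endpoint."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def try_acquire(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # empty bucket: the live API would answer HTTP 429

# e.g. a hypothetical endpoint allowing 0.5 requests/sec with a burst of 10
bucket = TokenBucket(rate=0.5, burst=10)
```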
Limits are dynamic. They depend on:

- the endpoint being called
- the size of the seller account
- Amazon's current server load

Posted rate limits are a baseline. Real limits drift, so treat the published table as a starting point and measure what your account actually gets.
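Most SP-API operations make that measurement possible by reporting the rate currently applied to you in the `x-amzn-RateLimit-Limit` response header (a few operations omit it). A sketch of reading it, assuming a `requests`-style response object:

```python
def observed_rate_limit(response) -> float | None:
    """Read the rate limit Amazon says currently applies to this call.
    Most SP-API operations return it in the x-amzn-RateLimit-Limit
    header, but not all, so treat a missing header as 'unknown'.
    `response` is assumed to expose a dict-like `.headers`."""
    value = response.headers.get("x-amzn-RateLimit-Limit")
    return float(value) if value is not None else None
```

Feeding this observed value back into your scheduler beats hard-coding the published numbers.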
How that plays out differs by API family:

- Reports API: reports take time to generate, so the constraint is not just request rate but report queue depth. Requesting many reports concurrently fails because Amazon queues your jobs.
- Product Pricing API: Amazon expects high-frequency calls here for repricing, so limits are higher, but aggressive polling on a large catalog still hits them.
- Catalog Items API: tight limits per request. Pulling catalog data for thousands of ASINs needs careful batching (see the sketch after this list).
- Orders API: reasonable limits, but pagination and date-range filtering matter for keeping request counts down.
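For the catalog case specifically, a minimal batching sketch. The batch size of 20 identifiers per call is an assumption based on commonly cited Catalog Items API limits; verify it against the API version you target.

```python
from typing import Iterable, Iterator

def asin_batches(asins: Iterable[str], size: int = 20) -> Iterator[list[str]]:
    """Split a large ASIN set into request-sized batches so each
    catalog call stays within the identifiers-per-request cap
    (size=20 is an assumption; check the current Catalog Items docs)."""
    batch: list[str] = []
    for asin in asins:
        batch.append(asin)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch
```

Each batch then flows through the per-endpoint queueing and backoff machinery described next, instead of being fired in parallel.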
Get this wrong and the result is hours-long ingestion windows, missing data, and silent failures.
The fix is architectural. First, give each endpoint its own queue with its own concurrency limit: hit a 429 on one endpoint and only that endpoint slows down.
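A minimal sketch of that isolation with asyncio; the endpoint names and concurrency numbers are hypothetical placeholders, not Amazon-documented values.

```python
import asyncio

# Hypothetical per-endpoint concurrency caps; real values come from
# the published limits plus what you observe in production.
ENDPOINT_CONCURRENCY = {"getOrders": 2, "createReport": 1, "getPricing": 4}

class EndpointQueues:
    """One semaphore per endpoint, so throttling on one endpoint
    never stalls traffic to the others."""

    def __init__(self, limits: dict[str, int]):
        self._sems = {name: asyncio.Semaphore(n) for name, n in limits.items()}

    async def run(self, endpoint: str, call):
        async with self._sems[endpoint]:
            return await call()

queues = EndpointQueues(ENDPOINT_CONCURRENCY)
```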
Second, back off exponentially on 429s: first retry after 1 second, then 2, then 4, then 8. Reset the delay on success.
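That schedule, sketched below. The jitter is an addition on top of the schedule above, to keep parallel workers from retrying in lockstep; `Throttled` is a hypothetical exception your request layer would raise on HTTP 429.

```python
import asyncio
import random

class Throttled(Exception):
    """Raised by the request layer when SP-API answers HTTP 429."""

async def call_with_backoff(call, max_retries: int = 5):
    """Exponential backoff on throttling: 1s, 2s, 4s, 8s, ...
    The delay resets on success because each fresh call starts at 1s."""
    delay = 1.0
    for _ in range(max_retries):
        try:
            return await call()
        except Throttled:
            # small random jitter so concurrent workers desynchronize
            await asyncio.sleep(delay + random.uniform(0, delay * 0.25))
            delay *= 2
    raise Throttled("gave up after repeated 429s")
```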
Third, adapt concurrency from feedback. Track the success rate per endpoint over short windows, and if the error rate climbs above a threshold, lower concurrency proactively rather than waiting for throttling to cascade.
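A sketch of that feedback loop; the window size and error threshold are illustrative knobs, not Amazon-documented values.

```python
from collections import deque

class AdaptiveLimiter:
    """Shrinks an endpoint's concurrency target when the recent
    429 rate crosses a threshold, before throttling cascades."""

    def __init__(self, start: int = 5, window: int = 50,
                 threshold: float = 0.1, floor: int = 1):
        self.limit = start
        self.threshold = threshold
        self.floor = floor
        self._outcomes = deque(maxlen=window)  # True = request was throttled

    def record(self, throttled: bool) -> int:
        self._outcomes.append(throttled)
        if len(self._outcomes) == self._outcomes.maxlen:
            error_rate = sum(self._outcomes) / len(self._outcomes)
            if error_rate > self.threshold and self.limit > self.floor:
                self.limit -= 1          # back off before Amazon forces you to
                self._outcomes.clear()   # start a fresh observation window
        return self.limit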
Fourth, checkpoint long-running ingestion. A full backfill or full catalog refresh that crashes mid-job should resume from the last checkpoint, not restart from scratch.
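A minimal checkpoint sketch; the file name and cursor field are hypothetical, and a real system would more likely keep this state in the same database as the ingested data.

```python
import json
import os

CHECKPOINT_PATH = "orders_backfill.ckpt"  # hypothetical location

def load_checkpoint() -> dict:
    """Resume point for the backfill; falls back to the job's start date."""
    if os.path.exists(CHECKPOINT_PATH):
        with open(CHECKPOINT_PATH) as f:
            return json.load(f)
    return {"last_updated_after": "2024-01-01T00:00:00Z"}

def save_checkpoint(state: dict) -> None:
    """Write-then-rename so a crash never leaves a half-written file."""
    tmp = CHECKPOINT_PATH + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, CHECKPOINT_PATH)

# In the ingestion loop: persist each page of results first, then
# advance and save the cursor, so a crash resumes instead of restarting.
```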
Finally, schedule by priority. Real-time, customer-facing requests come first; batch backfill yields whenever high-priority work arrives.
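One way to express that yielding is a priority queue shared by all producers; two priority levels are enough for the split described above.

```python
import asyncio
import itertools

REALTIME, BACKFILL = 0, 1          # lower number wins
_seq = itertools.count()           # tie-breaker: FIFO within a priority

work: asyncio.PriorityQueue = asyncio.PriorityQueue()

async def submit(priority: int, call) -> None:
    await work.put((priority, next(_seq), call))

async def worker() -> None:
    """Drains the queue; backfill items only run when no
    real-time work is waiting."""
    while True:
        _prio, _, call = await work.get()
        await call()               # the rate-limited request wrapper
        work.task_done()
```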
This infrastructure is not exciting. It is also not optional. Teams that build their own SP-API integration spend significant engineering time on:

- per-endpoint request queueing and token accounting
- retry and exponential-backoff logic for 429s
- concurrency-aware scheduling and adaptive throttling
- checkpointed, resumable backfills
- priority scheduling between real-time and batch work

It is a multi-month project, and it has to be maintained as Amazon changes its limits.
SP-API rate limits are the unglamorous reason most home-grown Amazon integrations stall. The architecture to handle them properly is well understood but real engineering work — the kind that should run on top of a maintained data layer instead of being rebuilt by every team.
DataDoe handles SP-API rate limit orchestration, retries, queueing and backfills as part of the Amazon data layer so your team can query clean data instead of fighting throttling.