Version 2.0

Limits

Foundry APIs enforce both rate limits and concurrency limits to ensure fair resource allocation for all users:

  • Rate limits: Restrict the number of your requests that Foundry APIs will process per minute.
  • Concurrency limits: Restrict the number of your requests that Foundry APIs will process concurrently, regardless of rate limits.
                    Rate limits                  Concurrency limits
Individual users    5,000 requests per minute    30 simultaneous requests
Service users       No request limit             800 simultaneous requests
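
One way to stay under the per-minute rate limit for individual users is to pace requests on the client side. The following is a minimal sketch of a token bucket, not part of any Palantir SDK: the class name `TokenBucket`, the `burst` size of 100, and the injectable `clock` are all assumptions made for illustration. The refill rate is chosen so that any 60-second window admits at most `limit_per_minute` requests (the burst plus one minute of refill).

```python
import time


class TokenBucket:
    """Hypothetical client-side pacer; not thread-safe as written."""

    def __init__(self, limit_per_minute=5000, burst=100, clock=time.monotonic):
        if not 0 < burst < limit_per_minute:
            raise ValueError("burst must be between 0 and limit_per_minute")
        self.capacity = float(burst)
        # burst + 60 * refill_per_second == limit_per_minute
        self.refill_per_second = (limit_per_minute - burst) / 60.0
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()

    def try_acquire(self):
        """Return True if a request may be sent now, False otherwise."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, never exceeding the burst capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A caller would check `bucket.try_acquire()` before each request and wait briefly when it returns `False`. Because the effective limits can differ from the stated values, treat the numbers here as a starting point rather than a guarantee.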

Requests that exceed these limits will be throttled and receive an HTTP 429 (Too Many Requests) or 503 (Service Unavailable) error response. Implement retries with exponential backoff in your applications to handle these errors when they occur.
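
The retry advice above can be sketched as follows. This is an illustrative outline under stated assumptions, not an official client: `send` is a hypothetical zero-argument callable that performs one request and returns a `(status_code, body)` tuple, and the attempt count, base delay, maximum delay, and use of random jitter are example choices rather than documented values.

```python
import random
import time

RETRYABLE_STATUSES = {429, 503}  # the throttling statuses named above


def backoff_delay(attempt, base_delay=0.5, max_delay=30.0):
    """Upper bound in seconds for the wait after failed attempt `attempt` (0-based)."""
    return min(max_delay, base_delay * (2 ** attempt))


def call_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-throttled status or attempts run out.

    `send` is a hypothetical callable returning (status_code, body).
    Returns the last (status_code, body) seen.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE_STATUSES:
            return status, body
        if attempt == max_attempts - 1:
            break  # out of attempts; do not sleep after the final try
        # Jitter: wait a random time up to the exponential bound so that
        # many clients throttled together do not all retry at the same moment.
        sleep(random.uniform(0, backoff_delay(attempt, base_delay, max_delay)))
    return status, body
```

With the defaults, the upper bound on each wait doubles from 0.5 seconds and is capped at 30 seconds. If the final attempt is still throttled, the caller receives the 429 or 503 status and can decide how to surface it.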

Understanding Limits

The effective limits experienced by users may vary from the values given above due to several factors:

  • Concurrency limits measure only the active processing time on Foundry API servers, not the total duration of a request. Most of a request’s time is spent in network transit or waiting, not in server processing. For example, you might have 300 requests in transit over a slow network, but only 25 actively being processed on the server, remaining within the 30-request concurrency limit.
  • Foundry APIs enforce these limits separately on each replica instance. Because users cannot control which replica handles a request, the effective limits may be higher than the stated values.
  • Some requests may be throttled by internal services, independent of the API limits described above. For example, requests may be limited due to specific ontology, function execution, or AIP agent limits designed to prevent abuse of computationally expensive operations. However, given the API controls described above, internal throttling should be extremely rare. If you experience throttling that is disruptive to the functioning of your application, contact Palantir Support.
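
Because the concurrency limit counts only requests being actively processed on the server, capping the number of requests a client has in flight is a conservative way to stay within it. The following is a minimal sketch using Python's standard thread pool. The function name `run_bounded` and the idea of passing each request as a zero-argument callable are assumptions for illustration, and the default of 30 mirrors the stated limit for individual users.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_bounded(tasks, max_in_flight=30):
    """Run zero-argument callables with at most `max_in_flight` running at once.

    Each task is a hypothetical callable that performs one API request.
    Results are returned in the same order as `tasks`.
    """
    # The pool never runs more than max_in_flight tasks at the same time;
    # the remaining tasks wait in the pool's queue until a worker is free.
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]
```

Since much of a request's duration is network transit rather than server processing, a client-side cap of 30 will usually leave headroom. Whether a higher cap is safe for a given workload is something to establish through testing rather than assume.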

We recommend performance testing when working with Palantir SDKs, especially for cases where usage is expected to be high-scale or "spiky". If you need help ensuring that your application performs at scale, contact Palantir Support.