high severityAmazon Bedrock Runtime (invoke_model, converse APIs)

ThrottlingException (HTTP 429) on InvokeModel/Converse calls after initial success: 'An error occurred (ThrottlingException) when calling the InvokeModel operation: Too many requests/tokens, please wait before trying again.' Low token usage but hits RPM limit (e.g. 1/min). [AWS Docs](https://docs.aws.amazon.com/bedrock/latest/userguide/troubleshooting-api-error-codes.html) [Stack Overflow](https://stackoverflow.com/questions/79420215/aws-bedrock-throttling-exception-when-using-sonnet-3-5-sonnet)

Root cause

Exceeding Amazon Bedrock's model-level service quotas: RPM (requests per minute), TPM (tokens per minute with upfront deduction of input + max_tokens, adjusted post-response; output × burndown rate e.g. 5x for Claude), TPD. Quotas shared across InvokeModel/Converse/Stream APIs; low defaults on new accounts (e.g. 1 RPM for Claude 3.5 Sonnet). [AWS Token Burndown](https://docs.aws.amazon.com/bedrock/latest/userguide/quotas-token-burndown.html)

bedrockThrottlingExceptionRPMTPMboto3claude-3.5-sonnetquotas

Citations