medium severity
Groq API, langchain-groq ChatGroq
APITimeoutError is raised after about 60 seconds on long prompts (more than a few thousand tokens). The request hangs or times out during pre-fill, before the first token is returned, so no response arrives even though the models are usually fast. Short prompts work.
Root cause
Pre-fill (prompt processing) time grows roughly linearly with prompt length, so on long prompts the Time to First Token (TTFT) can exceed the client's default 60-second timeout while the KV cache is still being computed. On the Flex tier, requests can additionally fail fast with server-side timeouts when resources are constrained. [Groq Python SDK](https://github.com/groq/groq-python) [Groq Latency Docs](https://console.groq.com/docs/production-readiness/optimizing-latency)
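One way to act on this root cause is to give the client a timeout that scales with prompt length instead of relying on the 60-second default. The sketch below is illustrative only: `timeout_for_prompt` and `make_chat_model` are hypothetical helpers, and the 500 tokens/s pre-fill throughput and 2x safety factor are placeholder assumptions, not measured Groq figures. It relies on `ChatGroq` accepting `timeout` and `max_retries`; check your installed langchain-groq version.

```python
import math


def timeout_for_prompt(
    prompt_tokens: int,
    prefill_tokens_per_s: float = 500.0,  # ASSUMPTION: placeholder, measure your own
    floor_s: float = 60.0,                # never go below the 60 s default
    safety_factor: float = 2.0,           # ASSUMPTION: headroom for queueing/decoding
) -> float:
    """Estimate a client timeout (seconds) that grows linearly with prompt size."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    if prefill_tokens_per_s <= 0:
        raise ValueError("prefill_tokens_per_s must be positive")
    estimated_prefill_s = prompt_tokens / prefill_tokens_per_s
    return max(floor_s, float(math.ceil(estimated_prefill_s * safety_factor)))


def make_chat_model(model: str, prompt_tokens: int):
    """Hypothetical helper: build a ChatGroq client with a scaled timeout."""
    # Imported lazily so the timeout helper above works without langchain-groq.
    from langchain_groq import ChatGroq

    return ChatGroq(
        model=model,
        timeout=timeout_for_prompt(prompt_tokens),  # seconds
        max_retries=2,
    )
```

With these placeholder numbers a 1,000-token prompt keeps the 60-second floor, while a 50,000-token prompt gets 200 seconds. If you call the Groq SDK directly, the equivalent is passing `timeout=` when constructing `Groq(...)`.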
Groq, langchain-groq, APITimeoutError, prefill, TTFT, service_tier, long-context
Citations