low severityPromptLayer SDK (Python wrapper)

Increased end-to-end inference latency (e.g., +100-500ms per request), higher p95/p99 tail latencies, potential timeouts under load when using promptlayer_client.run() or log_request() synchronously.

Root cause

Synchronous HTTP requests to PromptLayer API for logging/tracking add network round-trip latency (~50-200ms per request) during inference critical path, especially if rate-limited or under high load. Prompt fetching adds control-plane dependency.

PromptLayerlogginglatencynetwork-overheadasync

Citations

https://docs.promptlayer.com/languages/python