low severityPromptLayer SDK (Python wrapper)
Increased end-to-end inference latency (e.g., +100-500ms per request), higher p95/p99 tail latencies, potential timeouts under load when using promptlayer_client.run() or log_request() synchronously.
Root cause
Synchronous HTTP requests to PromptLayer API for logging/tracking add network round-trip latency (~50-200ms per request) during inference critical path, especially if rate-limited or under high load. Prompt fetching adds control-plane dependency.
PromptLayerlogginglatencynetwork-overheadasync