Resolved
We have scaled our infrastructure to drain the queue backlog. Evals are running without delay again.
Monitoring
We are continuing to monitor the situation.
Identified
We believe only a single customer is affected and that the delay is caused by the LLM provider's rate limits. Because we monitor the worst delay across our system, this initially looked like a bigger issue.
Investigating
We are investigating the root cause.