Langfuse

The underlying incident with our infrastructure (https://statuspage.incident.io/clickhousecloud/incidents/01KT1G25S9PBKM7VJEB146680G) has been resolved.

May

Thu

[US] Delayed LLM as a judge evaluation

5:13 PM

We've fully caught up and are processing LLM as a judge evaluations in real time again.

Wed

LLM as a Judge execution is delayed

6:36 PM

We have scaled our infra to drain the queue backlog. Evals are executed without delay again.

Tue

Delayed LLM as a judge execution

9:34 PM

The queue backlog has been fully processed for legacy trace and dataset targeted evals. There is no longer a delay for eval executions.

Thu

UI loads slowly and returns errors

11:46 AM

We observe full recovery across all API and UI routes.

Wed

[EU] Elevated latencies and timeouts

9:07 AM

Latencies and error rates have recovered.

Sat

[EU, US, HIPAA, JP] Infrastructure Maintenance

8:30 AM

Maintenance has been completed.

Fri

Errors and latencies on ingestion APIs

10:35 AM

The underlying issue was resolved.

Thu

API errors and latencies

3:09 PM

We resolved the underlying issue. A distributed cache rollout for ClickHouse caused the latency and error spikes. The rollout was reverted, and API performance is back to normal.

Wed

[EU] Elevated processing times

9:02 AM

We caught up with our queues and have regular processing times again.

Tue

Delayed OTEL ingestion

9:50 PM

All data is being processed on time again.

April

Wed

[EU + US] Elevated errors for API reads

4:02 AM