Monitoring
We are still reprocessing data that was not ingested during the incident. We expect all data to be processed within the next few hours.
Identified
We’ve identified and resolved the root cause of the issue. All new data is now being successfully ingested, and our APIs are fully operational again. We’ve also resumed processing of the data whose ingestion was temporarily paused.
Investigating
We’ve temporarily paused ingestion of new SDK data into our ClickHouse database. During this time, recent data will not appear in the UI or APIs. Once the issue is resolved, all queued data will be replayed and fully restored.
Investigating
Because the database is under high load, some `INSERT` queries have failed. We will replay the failed data once we have found and fixed the root cause.
Investigating
We are currently observing memory issues in our ClickHouse database. These cause database queries to fail, both for our UI and for API requests that retrieve data from Langfuse. We are investigating the issue.