Delayed event processing
Resolved
Nov 05 at 03:54pm CET
Today, between 12:54 CET and 13:18 CET, we observed database connection issues in our asynchronous event processors in one of our hosting zones. In response, we stopped our workers, which caused the previously reported delays. Before we scaled down the worker instances, there was a window of approximately 10 minutes in which events that had been accepted by the API were not processed by the workers and were dropped with an error. We estimate that this affected about one third of the events sent to our EU instance within that timeframe.
We have added error handling that records and stores failed events, which allows us to replay them later instead of dropping them when an error occurs.
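The record-and-replay approach described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the names `FailedEventStore` and `process_event` are hypothetical, and the in-memory list stands in for whatever durable storage a production system would use.

```python
import json
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class FailedEventStore:
    """Hypothetical store for failed events (in-memory here; assume durable storage in practice)."""

    records: list = field(default_factory=list)

    def record(self, event: dict, error: Exception) -> None:
        # Keep the full event payload plus the error so it can be replayed and diagnosed later.
        self.records.append(json.dumps({"event": event, "error": repr(error)}))

    def replay(self, handler: Callable[[dict], None]) -> int:
        # Take the current backlog, then re-run each event through the handler.
        pending, self.records = self.records, []
        replayed = 0
        for raw in pending:
            entry = json.loads(raw)
            try:
                handler(entry["event"])
                replayed += 1
            except Exception as exc:
                # Still failing: store it again rather than dropping it.
                self.record(entry["event"], exc)
        return replayed


def process_event(event: dict, handler: Callable[[dict], None], store: FailedEventStore) -> bool:
    """Run the handler; on any error, record the event instead of dropping it."""
    try:
        handler(event)
        return True
    except Exception as exc:
        store.record(event, exc)
        return False
```

In this sketch, a worker would call `process_event` for each accepted event, and an operator would call `store.replay(handler)` once the underlying problem (for example, a database connection issue) is resolved. Events that fail again during replay stay in the store.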
We apologize for any inconvenience.
Affected services
[EU] Trace Ingestion
Updated
Nov 05 at 01:38pm CET
We found the root cause and provided a fix. Everything should be working as expected now.
Affected services
[EU] Trace Ingestion
Created
Nov 05 at 01:00pm CET
We have identified an issue with our database and are observing delayed event processing. We are working on finding the root cause and providing a fix.
Affected services
[EU] Trace Ingestion