Incident Summary
On October 23, 2025, between 20:36 UTC and 23:31 UTC, customers in both the US and EU regions may have experienced delays in data processing, message sending and the display of sending-state notifications in the user interface. Data ingestion was not affected, and no data was lost.
During routine maintenance, the engineering team deployed database schema updates intended to support future feature, reliability and performance improvements. These updates unintentionally increased operational load on several database servers, slowing their response until the updates completed or were manually stopped.
Root Cause
A schema update that modified an existing database table was applied across all databases. Because the update ran against a large number of databases in rapid succession, the database engines became overwhelmed managing the resulting internal metadata changes. The update process had no throttling or short delays between operations, either of which would have reduced load and prevented the performance degradation.
Resolution and Recovery
Service performance returned to normal once the schema updates finished or were stopped. One database server was proactively restarted to expedite full recovery. Following stabilization, the team identified the update responsible and confirmed no residual impact or ongoing risk.
Corrective and Preventative Measures
To prevent recurrence, the team is enhancing the schema update process to include built-in throttling and guardrails that limit database load. Code templates will be updated accordingly, and review processes will highlight the need for these safeguards. Monitoring improvements are also being implemented to better detect early signs of database strain and alert engineers sooner. These measures are being tracked and prioritized within our internal development process.
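As an illustration of the kind of safeguard described above, the following is a minimal sketch of a throttled schema rollout with a load guardrail. It is not our actual tooling. The function name, its parameters, the load threshold and the `current_load` callback (assumed here to return a utilization value between 0 and 1) are all hypothetical and chosen only for the example.

```python
import time
from typing import Callable, Iterable, List


def apply_schema_update_throttled(
    databases: Iterable[str],
    apply_update: Callable[[str], None],   # hypothetical: runs the change on one database
    current_load: Callable[[], float],     # hypothetical: returns server load in [0, 1]
    delay_seconds: float = 1.0,            # pause between databases (throttling)
    max_load: float = 0.7,                 # guardrail threshold (assumed value)
    max_backoffs: int = 5,                 # give up after this many consecutive backoffs
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Apply an update one database at a time, pausing between operations
    and backing off while the server reports high load.

    Returns the databases that were updated. If the load stays high for
    too long, the rollout stops early and the rest are left untouched.
    """
    updated: List[str] = []
    for db in databases:
        backoffs = 0
        # Guardrail: wait with exponential backoff while load is too high.
        while current_load() > max_load:
            if backoffs >= max_backoffs:
                return updated  # stop the rollout instead of adding more load
            backoffs += 1
            sleep(delay_seconds * 2 ** backoffs)
        apply_update(db)
        updated.append(db)
        # Throttle: short delay before touching the next database.
        sleep(delay_seconds)
    return updated
```

In this sketch a caller would compare the returned list with the full list of databases to see whether the rollout was cut short, then resume the remainder once load has recovered.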