The app is temporarily inaccessible
Incident Report for Customer.io
Postmortem

Overview

Between 14:44 UTC and 14:52 UTC on 2018-11-19, message sending was disrupted and no accounts or users could log in to or access the management interface. No data was lost during the outage. Messages that were not sent during this window were sent after 14:52 UTC, once the incident was resolved.

Incident Timeline

  • 14:44 UTC - a database migration to manage internal data begins and completes a minute later.
  • 14:47 UTC - the SRE team is notified of issues by our alerting system and begins investigating.
  • 14:49 UTC - the issue is identified by the SRE team.
  • 14:51 UTC - the issue is resolved by the backend team.
  • 14:53 UTC - we publish a status to notify our customers of the issue.
  • 14:55 UTC - we update the status notifying our customers that the issue is resolved.

Root Cause

We performed a SQL database migration to drop a column that we believed was no longer in use. In fact, parts of our application code still referenced this column, so SQL queries that named it began failing once the migration completed, which caused the observed failures.
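
The failure mode can be reproduced in miniature. The sketch below is illustrative only and does not use our actual schema or code: the `accounts` table, the `legacy_flag` column, and the helper names are hypothetical, and an in-memory SQLite database stands in for the production database. The column is removed by rebuilding the table so that the example runs on any SQLite version.

```python
import sqlite3


def drop_column_by_rebuild(conn, table, keep_columns):
    """Remove a column by rebuilding the table with only keep_columns.

    A portable stand-in for ALTER TABLE ... DROP COLUMN (hypothetical helper).
    """
    cols = ", ".join(keep_columns)
    conn.execute(f"CREATE TABLE {table}_new AS SELECT {cols} FROM {table}")
    conn.execute(f"DROP TABLE {table}")
    conn.execute(f"ALTER TABLE {table}_new RENAME TO {table}")


def run_demo():
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: legacy_flag plays the role of the "unused" column.
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT, legacy_flag INTEGER)"
    )
    conn.execute("INSERT INTO accounts (name, legacy_flag) VALUES ('acme', 1)")

    # Application query that still references the column.
    app_query = "SELECT id, name, legacy_flag FROM accounts"
    rows_before = conn.execute(app_query).fetchall()

    # The migration: drop the column.
    drop_column_by_rebuild(conn, "accounts", ["id", "name"])

    # The same application query now fails at the database layer.
    try:
        conn.execute(app_query)
        error = None
    except sqlite3.OperationalError as exc:
        error = str(exc)

    # The remaining data is intact; only the stale query is broken.
    rows_after = conn.execute("SELECT id, name FROM accounts").fetchall()
    conn.close()
    return rows_before, error, rows_after


if __name__ == "__main__":
    before, error, after = run_demo()
    print("before migration:", before)
    print("after migration, stale query error:", error)
    print("after migration, remaining rows:", after)
```

The rows that remain are unaffected, which matches the observation that no data was lost: the stale queries fail, not the stored data.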

Resolution

The migration was reversed to temporarily restore the column. Our application code was then corrected to stop referencing the legacy column so that the migration could proceed as planned. We will conduct an internal investigation of this issue and make appropriate improvements to our systems to help prevent or minimize recurrence during future database migrations.
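
One possible safeguard, shown here only as a sketch and not as a description of the improvements we will make, is to search the application source for a column name before a migration drops it. The function names, the `legacy_flag` column in the usage note, and the list of file suffixes are all assumptions made for illustration.

```python
import re
from pathlib import Path


def references_in_text(text, column):
    """Return the 1-based line numbers in text that mention column as a whole word."""
    pattern = re.compile(rf"\b{re.escape(column)}\b")
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


def find_column_references(root, column, suffixes=(".py", ".sql", ".go", ".rb", ".js")):
    """Walk root and return (path, line_number) pairs for files mentioning column.

    The suffix list is an assumption; adjust it to the languages in the codebase.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            hits.extend((str(path), number) for number in references_in_text(text, column))
    return hits
```

A migration pipeline could call `find_column_references("src", "legacy_flag")` and refuse to drop the column while the result is non-empty. A plain text search of this kind cannot detect column names that are assembled dynamically at runtime, so it reduces the risk without eliminating it.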

Posted Nov 19, 2018 - 22:46 UTC

Resolved
This incident has been resolved.
Posted Nov 19, 2018 - 14:55 UTC
Update
Sorry for the inconvenience. This has been fixed.
Posted Nov 19, 2018 - 14:55 UTC
Investigating
Following a recent deploy, the UI became inaccessible. Our engineering team is on the case and we will provide another update by 3:30 pm UTC.
Posted Nov 19, 2018 - 14:53 UTC
This incident affected: Management Interface.