Complete outage due to external providers

Incident Report for Red Sift UK

Postmortem

Summary

On 2025-06-04 at 10:26 UTC, Red Sift experienced a service outage affecting multiple products, including Brand Trust, OnDMARC, and Certificates. The disruption was caused by a regional IBM Cloud storage outage in the UK, which affected several IBM Cloud systems and dependent services. The incident led to service unavailability and degraded performance across components reliant on cloud storage.

Root Cause

The incident was caused by a widespread IBM Cloud storage outage in the UK region. IBM reported failures across several of its internal systems and services, which cascaded into storage unavailability for customers in the affected region.

Red Sift relies on IBM Cloud object storage and related subsystems for key operational workflows. The outage caused storage operations to fail, resulting in downstream service disruption across Brand Trust, OnDMARC, and Certificates.

Impact

Brand Trust

  • Service disruptions for features dependent on cloud storage
  • Delayed or failed processing of certain requests

OnDMARC

  • Inability to access or store required assets
  • Certain backend operations stalled or failed

Certificates

  • Certificate processing and issuance workflows impacted
  • Storage operations unavailable, causing service interruptions

Overall Platform Impact

  • Multiple services unavailable between 10:26 UTC and approximately 11:00 UTC
  • No data loss occurred
  • No security issues were observed

Business Impact

  • Temporary disruption to product functionality and customer workflows
  • Monitoring, processing, and certificate-related operations stalled
  • No permanent data integrity issues or breaches
  • Customer-facing reliability temporarily degraded

Timeline

All times in UTC.

  • 10:26 – Alerts triggered indicating widespread storage failures
  • 10:26–10:40 – Investigation initiated; multiple services confirmed impacted
  • 10:40 – Root cause traced to external IBM Cloud storage outage
  • 10:45 – Mitigation efforts initiated; cloud storage dependency disabled where possible
  • ~11:00 – Storage systems began recovering; services started to stabilize
  • 11:10 – Full service restoration observed across all affected components
  • 11:15 – Incident closed after confirming provider recovery and operational stability

Resolution

Recovery occurred automatically once IBM Cloud restored storage availability in the UK region. Red Sift systems resumed normal operation without requiring deep infrastructure-level changes. We disabled cloud storage dependencies where feasible to limit the impact, and service performance stabilized as the upstream provider resolved the issue.
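
As an illustration only, the idea of disabling a storage dependency where it is safe to do so can be sketched as a small wrapper with a kill switch. This is not Red Sift's implementation: the names `DegradableStorage`, `StorageUnavailable`, and `failure_threshold` are hypothetical, and the sketch assumes callers can tolerate a missing object (`None`) while the provider is down.

```python
from typing import Callable, Optional


class StorageUnavailable(Exception):
    """Hypothetical error raised by a storage client when the provider is down."""


class DegradableStorage:
    """Wraps a storage read so it can be switched off during a provider outage."""

    def __init__(self, fetch: Callable[[str], bytes], failure_threshold: int = 3):
        self._fetch = fetch                      # hypothetical provider call
        self._failure_threshold = failure_threshold
        self._consecutive_failures = 0
        self.disabled = False                    # kill switch: set by an operator or tripped below

    def get(self, key: str) -> Optional[bytes]:
        # While disabled, skip the provider entirely so requests fail fast.
        if self.disabled:
            return None
        try:
            data = self._fetch(key)
        except StorageUnavailable:
            # Count consecutive failures and trip the switch at the threshold.
            self._consecutive_failures += 1
            if self._consecutive_failures >= self._failure_threshold:
                self.disabled = True
            return None
        # A successful read resets the failure counter.
        self._consecutive_failures = 0
        return data
```

In a sketch like this, re-enabling is a deliberate step (setting `disabled = False` once the provider confirms recovery), which roughly mirrors verifying recovery before closing an incident.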

Customer Impact Mitigation

  • Continuous monitoring ensured rapid detection and response
  • Cloud storage functionality was disabled where safe to reduce impact
  • No customer data was lost or compromised
  • Service recovery verified before incident closure
  • Transparent communication was maintained throughout the incident

We sincerely apologize for any inconvenience caused by this outage. Although the root cause was an external failure outside our control, we remain committed to improving our resilience and ensuring reliable, uninterrupted service for all customers.

Posted Nov 28, 2025 - 12:03 UTC

Resolved

The incident appears to have been resolved. We are operational again.
Posted Jun 04, 2025 - 13:12 UTC

Update

We are continuing to work on a fix for this issue.
Posted Jun 04, 2025 - 11:27 UTC

Update

The OnDMARC application has also been partially restored.
Posted Jun 04, 2025 - 11:15 UTC

Update

Services have been partially restored for Certificates and Brand Trust.
Posted Jun 04, 2025 - 11:06 UTC

Identified

The issue has been identified. We are investigating alternative solutions to restore some of our services.
Posted Jun 04, 2025 - 11:04 UTC

Investigating

We are experiencing an issue with our storage system that is impacting our apps and backend.
Posted Jun 04, 2025 - 10:34 UTC
This incident affected: OnDMARC (Web Application and APIs, Infrastructure), Brand Trust (Web Application and APIs, Infrastructure), and Certificates (Web Application and APIs, Infrastructure).