Between 11:43 AM and 12:14 PM CET (31 minutes), the platform experienced a full service disruption affecting several core services.
The incident was caused by a problem distributing a renewed internal certificate authority (CA) certificate used by the clusters in our load balancer layer. This CA is part of the trust chain that validates encrypted end-to-end communication between internal components. When the distribution problem caused that validation to fail, internal services could no longer establish trusted connections with each other.
The same CA was shared across the communication path between our production and backup infrastructure. As a result, traffic redirected to the backup infrastructure was affected by the same issue.
Our engineering team identified the root cause and applied mitigation measures to restore the platform to normal operation. The service is now fully restored and we continue to monitor system stability.
After reviewing the incident, we have confirmed that some publishing and analytics data affected during the disruption cannot be recovered.
We apologize for the impact this incident may have caused. We are already working on additional safeguards, including stronger end-to-end validation of certificate renewal, distribution, and activation across our load balancer layer; improved alerting for certificate deployment inconsistencies; and further separation between production and backup infrastructure to reduce shared failure domains.
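As an illustration of the alerting safeguard mentioned above, the following is a minimal sketch of how certificate deployment inconsistencies could be detected. It compares a fingerprint of the CA bundle found on each node against the fingerprint of the renewed CA. The node names, the function names, and the placeholder certificate contents are all hypothetical and do not describe our actual tooling. Hashing the raw PEM bytes is a simplification; a real check would more likely compare fingerprints of the decoded certificate.

```python
import hashlib


def fingerprint(pem_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of a CA bundle's raw bytes."""
    return hashlib.sha256(pem_bytes).hexdigest()


def find_inconsistent_nodes(deployed: dict[str, bytes], expected: bytes) -> list[str]:
    """Return the sorted names of nodes whose deployed CA differs from the expected one.

    `deployed` maps a (hypothetical) node name to the CA bundle bytes read from it.
    """
    expected_fp = fingerprint(expected)
    return sorted(
        node for node, pem in deployed.items() if fingerprint(pem) != expected_fp
    )


# Placeholder contents standing in for real PEM files (assumption for illustration).
renewed_ca = b"-----BEGIN CERTIFICATE-----\nRENEWED\n-----END CERTIFICATE-----\n"
previous_ca = b"-----BEGIN CERTIFICATE-----\nPREVIOUS\n-----END CERTIFICATE-----\n"

# Hypothetical snapshot: one production node still carries the previous CA.
deployed = {
    "lb-prod-1": renewed_ca,
    "lb-prod-2": previous_ca,
    "lb-backup-1": renewed_ca,
}

stale = find_inconsistent_nodes(deployed, renewed_ca)
if stale:
    print("CA mismatch on:", ", ".join(stale))
```

Running a check of this kind before a renewed CA is activated, and across both production and backup nodes, would raise an alert while the mismatch is still limited to deployment rather than live traffic.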