At 20:00 GMT we detected a bug in the Safe Updates queue. In certain scenarios the system delays an update by 5 seconds, which reduces load and prevents the update from triggering while a backup is being made (among other things). The bug prevented the update from getting back into the queue after the delay, leaving it in limbo. This in turn made the update queue in the dashboard front end appear endless.
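To make the failure mode easier to picture, here is a minimal sketch of a delay-then-requeue mechanism like the one described above. It is a toy model, not ManageWP's actual code: the `UpdateQueue` class, its method names, and the `backup_running` flag are all hypothetical. Only the 5-second delay comes from the description.

```python
import heapq
from dataclasses import dataclass, field

DELAY_SECONDS = 5  # the delay mentioned in the incident description


@dataclass
class UpdateQueue:
    """Toy delay-then-requeue queue (hypothetical, for illustration only)."""

    ready: list = field(default_factory=list)    # updates that may run now
    delayed: list = field(default_factory=list)  # heap of (ready_at, seq, update)
    _seq: int = 0                                # tie-breaker for equal ready_at values

    def submit(self, update, now, backup_running=False):
        if backup_running:
            # Defer the update instead of running it during a backup.
            heapq.heappush(self.delayed, (now + DELAY_SECONDS, self._seq, update))
            self._seq += 1
        else:
            self.ready.append(update)

    def promote_due(self, now):
        """Move every delayed update whose wait has elapsed into the ready queue.

        If this step never happens, a delayed update stays in limbo,
        which is the kind of failure described above.
        """
        while self.delayed and self.delayed[0][0] <= now:
            _, _, update = heapq.heappop(self.delayed)
            self.ready.append(update)
```

In this sketch, an update submitted with `backup_running=True` at time 0 only shows up in `ready` once `promote_due` is called with a time of 5 or later. Dropping that call reproduces the stuck-in-limbo symptom.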
The exact number of affected users is unknown, since some updates were affected while others worked as usual. A rough estimate is that around 10% of users experienced at least one delayed update.
While we are still not certain what caused this bug, we’ve built a workaround for it, ensuring it does not happen again. We will not rest until we get to the bottom of this, no matter what it takes, Dana. The truth is out there.
At 8:30 GMT a bug in the ManageWP server back end triggered a high volume of notifications that were sent to the server database. This in turn caused the server to become unstable. By 10:00 GMT we had fixed the bug and restored the service.
People logging in between 8:30 and 10:00 GMT experienced intermittent glitches – some could not log in, getting a 502 error. Others would get an occasional error message on the dashboard, but were otherwise able to manage their websites.
Diligence, diligence, diligence. We’re constantly ramping up our efforts to test the code we push live. As a result, we’re catching bugs that would otherwise go undetected. Some bugs will inevitably sneak into production, though. It’s up to us to fix them ASAP, unless we want to inadvertently cause a machine uprising. And I’m not talking about the good kind, like The Matrix, but the Maximum Overdrive kind, with Emilio Estevez.