S2 - Intermittent Slowness in SpaceIQ
Incident Report for SiQ
Postmortem

SpaceIQ Detailed Root Cause Analysis – Severity 2 – June 5, 2023 

Intermittent Slowness in SpaceIQ 

 

Description: 

On June 5, 2023, at approximately 10:52am MST, internal teams and customer support began receiving reports of intermittent slowness throughout the SpaceIQ platform. Some users saw a spinning wheel or a "Network Request failed" error when trying to access or navigate the platform.

 

Type of Event: 

Performance Degradation  

 

Services/Modules Impacted:

Production 

 

Timeline:  

10:52am MST – Internal teams and customer support began receiving reports of intermittent slowness throughout the SpaceIQ platform. Some users saw a spinning wheel or a "Network Request failed" error when trying to access or navigate the platform. A short investigation followed to gather information and reproduce the customers' experiences.

1:12pm MST – The initial ticket was escalated to an S2 incident and posted to our status page to alert customers. Internal teams acknowledged the issue and began investigating. The issue was quickly identified, and a fix was implemented.

1:47pm MST – Customers were alerted on the status page that we were monitoring the fix. Customers confirmed during monitoring that the fix had worked.

4:21pm MST – With no additional reports of slowness, the status page was marked as resolved.

 

Total Duration of Event:  

5 hrs 29 mins

 

Root Cause Analysis: 

The engineering and dev ops teams identified an issue with a caching server. A manual restart of the server resolved the issue.

 

Preventative Action:  

Our dev ops team continues to investigate why the caching server did not restart automatically. They have put new processes in place to ensure that the appropriate teams are also alerted if this happens again.

Posted Jun 15, 2023 - 16:37 UTC

Resolved
As customers have confirmed resolution and no further reports have been made to SpaceIQ, we are moving this issue from Monitoring to Resolved. We appreciate your patience and will provide a Root Cause Analysis within 10 business days.
Posted Jun 05, 2023 - 22:21 UTC
Monitoring
Our Engineering and Dev Ops teams have found the root cause and implemented a fix. We will be monitoring this issue for the next 2 hours and will return with another update. We appreciate your patience while we worked to find a solution.
Posted Jun 05, 2023 - 19:47 UTC
Investigating
We are currently investigating intermittent slowness when logging in to the platform. Some users may see a spinning wheel, and others may see a "Network Request failed" error.
Posted Jun 05, 2023 - 19:12 UTC
This incident affected: System Status.