S1 - Inability to Access SpaceIQ
Incident Report for SiQ
Postmortem

SpaceIQ by Eptura detailed Root Cause Analysis | 02/25/2024

 

We are truly grateful for your continued support and loyalty. We value your feedback and appreciate your patience as we worked to resolve this incident.

 

Description:

On the evening of Sunday, February 25, 2024, at approximately 11:55pm EST, multiple users reported that they were unable to access the SpaceIQ platform.

 

Type of Event:

Outage

 

Services/Modules impacted:

Production

 

Timeline:

On the evening of Sunday, February 25, 2024, at approximately 11:55pm EST, multiple users reported that they were unable to access the SpaceIQ platform. Internal channels were alerted, and at 12:10am EST on 02/26/2024 the CloudOps team acknowledged the issue and began investigating. At approximately 12:54am EST, the issue was resolved and users were able to access the platform again.

 

Total Duration of Event:

Approximately 1 hour (11:55pm to 12:54am EST)

 

 

Root Cause:

The database had reached its maximum storage threshold.

 

Remediation:

The maximum storage threshold was increased by 1 TB to resolve the issue.

 

Preventative Action:

  • The RDS instance has been added to LogicMonitor for additional monitoring of its core metrics.

  • A meeting has been set up with the Dev teams to review the usage pattern and determine whether the logging requirements can be reduced or optimized.
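
As an illustration of the kind of storage check the monitoring action above is meant to provide, here is a minimal sketch in Python. It is not the LogicMonitor configuration used by the team. The class and function names, the 80% and 90% alert levels, and the sample figures are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class StorageSample:
    """One reading of database storage (hypothetical structure for illustration)."""
    allocated_gb: float        # current maximum storage threshold
    free_gb: float             # free space reported by the monitoring tool
    growth_gb_per_day: float   # observed average daily growth


def storage_alert(sample: StorageSample,
                  warn_pct: float = 80.0,
                  crit_pct: float = 90.0) -> str:
    """Classify a sample as OK, WARNING, or CRITICAL by percent of storage used.

    The default alert levels are assumptions, not values from the report.
    """
    used_gb = sample.allocated_gb - sample.free_gb
    used_pct = 100.0 * used_gb / sample.allocated_gb
    if used_pct >= crit_pct:
        return "CRITICAL"
    if used_pct >= warn_pct:
        return "WARNING"
    return "OK"


def days_until_full(sample: StorageSample) -> float:
    """Estimate days until free space runs out at the current growth rate."""
    if sample.growth_gb_per_day <= 0:
        return float("inf")  # no growth observed, so no projected exhaustion
    return sample.free_gb / sample.growth_gb_per_day


if __name__ == "__main__":
    # Made-up figures: 1000 GB allocated, 50 GB free, growing 10 GB per day.
    sample = StorageSample(allocated_gb=1000.0, free_gb=50.0, growth_gb_per_day=10.0)
    print(storage_alert(sample), days_until_full(sample))
```

Alerting on projected days until full, in addition to percent used, gives earlier warning when logging volume grows quickly, which is the usage pattern the second action item is meant to review.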

Posted Mar 11, 2024 - 22:08 UTC

Resolved
On the evening of Sunday, February 25, 2024, at approximately 11:55pm EST, multiple users reported that they were unable to access the SpaceIQ platform.
Posted Feb 26, 2024 - 05:30 UTC