Incident Date: October 3rd, 2024
Incident Duration: Approximately 20 minutes
Affected Services: TrekkSoft API, TrekkSoft Application, POS Desk
Incident Description:
At approximately 12:15 PM CET on October 3rd, 2024, the TrekkSoft API, the TrekkSoft Application, and POS Desk became unavailable.
Impact:
The Redis node used by the API for session storage was rebooted and came back into service approximately 20 minutes later. The reboot happened outside the maintenance windows. We opened a support ticket with AWS to understand why the node was rebooted.
Resolution:
The incident resolved once the rebooted Redis node (used by the API for session storage) came back into service.
Learnings:
The API uses redis-core-production for session storage. This is a single-node instance, so a reboot of that node makes session storage unavailable until the node returns.
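The dependency described above can be illustrated with a minimal sketch. It is not the TrekkSoft implementation: SingleNodeSessionStore is a hypothetical in-memory stand-in for a one-node session store (not a real Redis client), handle_request is a hypothetical API handler, and the status codes are assumptions chosen for illustration. Whether sessions survive a reboot depends on persistence settings, which this sketch does not model.

```python
class SingleNodeSessionStore:
    """Hypothetical stand-in for a one-node session store (not a real Redis client)."""

    def __init__(self):
        self._sessions = {}
        self.available = True  # set to False to simulate the node rebooting

    def put(self, session_id, data):
        if not self.available:
            raise ConnectionError("session store node is unavailable")
        self._sessions[session_id] = data

    def get(self, session_id):
        if not self.available:
            raise ConnectionError("session store node is unavailable")
        return self._sessions.get(session_id)


def handle_request(store, session_id):
    """Hypothetical API handler returning an HTTP-like status code (assumed values)."""
    try:
        session = store.get(session_id)
    except ConnectionError:
        # With a single node there is nothing to fail over to.
        return 503
    return 200 if session is not None else 401


store = SingleNodeSessionStore()
store.put("abc", {"user": "demo"})

status_before = handle_request(store, "abc")  # node up

store.available = False                       # node reboots
status_during = handle_request(store, "abc")  # every session lookup fails

store.available = True                        # node comes back
status_after = handle_request(store, "abc")
```

In this sketch every request that needs a session fails for as long as the single node is away, and requests succeed again once it returns, which matches the pattern of the outage described in this report.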
Preventive Measures: