TrekkSoft servers issue affecting several production functionalities
Incident Report for TrekkSoft
Postmortem

Root cause
A new release containing database changes was being pushed to production, which locked access to some database tables. Because a high volume of requests needed those tables while they were locked, the requests piled up in the queue and overloaded our servers.
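As a rough illustration of why blocked requests pile up, the sketch below models a backlog that grows while tables are locked. The arrival rate, service capacity, and lock duration are hypothetical numbers chosen for the example, not measurements from this incident.

```python
def backlog_over_time(arrival_per_sec, service_per_sec, lock_seconds, total_seconds):
    """Return the queue length at the end of each second.

    While the tables are locked (the first `lock_seconds`), no request
    can complete, so every arriving request joins the queue.
    """
    backlog = 0
    history = []
    for second in range(total_seconds):
        backlog += arrival_per_sec
        if second >= lock_seconds:
            # Lock released: requests are served again, up to capacity.
            backlog = max(0, backlog - service_per_sec)
        history.append(backlog)
    return history


# Hypothetical values: 50 requests/s arriving, capacity of 80 requests/s,
# tables locked for the first 60 seconds of a 120-second window.
history = backlog_over_time(50, 80, 60, 120)
peak = max(history)
```

In this simplified model the backlog keeps growing for as long as the lock is held, and it drains only at the rate of spare capacity afterwards, so a short lock under heavy traffic can leave servers overloaded well past the moment the lock is released.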

What happened
For an estimated 45 minutes, users were unable to access their TrekkSoft sites because our production servers were down.

What we did
When our developers identified that the new release was degrading the performance of our services and overloading the servers, they restarted the servers and rolled back the release to return them to normal operation.

The consequences

During the 45-minute incident window, no bookings could be processed.

Learnings

To prevent similar incidents in the future, we have updated our internal release guidelines to avoid releasing critical changes during peak hours, when high availability is needed.
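A guideline like this could be backed by an automated check in a release pipeline. The following is a minimal sketch only: the peak window, the function name `release_allowed`, and the `is_critical` flag are assumptions made for illustration and are not taken from our actual guidelines or tooling.

```python
from datetime import datetime, time

# Hypothetical peak window; the actual peak hours are not stated in this report.
PEAK_START = time(8, 0)
PEAK_END = time(20, 0)


def release_allowed(now: datetime, is_critical: bool) -> bool:
    """Allow non-critical releases at any time; critical ones only off-peak."""
    if not is_critical:
        return True
    return not (PEAK_START <= now.time() < PEAK_END)
```

For example, under these assumed hours a critical release attempted at 19:00 would be rejected, while the same release at 22:00 would be allowed.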

We apologize for any inconvenience this might have caused you.

Posted Jul 07, 2021 - 16:29 CEST

Resolved
The issue with the TrekkSoft servers has been resolved.
All affected services are working as expected again.
Our developers are continuing to monitor the issue closely to ensure there are no further performance issues.
Posted Jul 06, 2021 - 20:08 CEST
Investigating
We are currently experiencing issues with the TrekkSoft servers. This is affecting most of our product functionality.
Our developers are investigating with the highest priority to identify the root cause of the issue and its resolution.
We will keep you updated and apologize for the inconvenience caused.
Posted Jul 06, 2021 - 19:28 CEST
This incident affected: TrekkSoft Application, TrekkSoft API, Backend Mobile Applications, POS Desk, and TrekkSoft Website.