At 4:55PM EST we suffered a hardware outage on a critical component. We believed that this component would failover automatically to it's redundant servers, but this did not happen as initially designed.
After investigation, we altered the configuration of our application to point to the redundant server and re-deployed our entire application to use this updated configuration.
Service restoration started around 5:25PM EST with full services restored around 5:30PM EST as the remaining servers were deployed.
We're going to identify the cause of the outage, resolve the automated failover issue and do a full service discovery to identify any other single points of failure in our application so we can plan work to resolve them.
Posted Oct 01, 2018 - 14:59 PDT
The issue has been identified and a fix is being implemented.
Posted Oct 01, 2018 - 14:30 PDT
We are currently investigating this issue.
Posted Oct 01, 2018 - 14:06 PDT
This incident affected: ReCharge API, ReCharge Admin panel, ReCharge Checkout, and ReCharge Marketing website.