There are no network issues
All servers are up and running
04.24.2019 19:01 EST BARRACUDA server outage investigation results:
Cascade of events
1. 04.23.2019 17:13 EST The power cord attached to our private rack in the Dallas data center (DC) failed (nobody was aware of this until 04.23.2019 23:55 EST).
2. 04.23.2019 17:17 EST WestNIC staff received an automated notification call from the external monitoring station that the BARRACUDA server went down.
3. 04.23.2019 17:29 EST Traceroute results suggested network congestion between the Dallas data center and its upstream providers, Stackpath and Cogent.
4. 04.23.2019 17:51 EST Network issues were confirmed at the data center.
5. 04.23.2019 17:51 EST - 04.23.2019 19:20 EST Communication between DC technicians, upstream providers and our staff in New York.
6. 04.23.2019 20:01 EST Emergency maintenance was declared: backup files (dated 04.20.2019) were moved from the offsite server in Northern Virginia to New York.
7. 04.23.2019 22:41 EST A new server in Dallas was licensed for CloudLinux / cPanel and ready for account restore.
8. 04.23.2019 22:45 EST Transfer of the offsite backups from New York to Dallas was initiated.
9. 04.23.2019 22:51 EST Multiple attempts were made to power up the server rack and run FSCK (file system check). FSCK froze twice because of power issues.
10. 04.23.2019 23:55 EST A data center engineer, sent to check the server hardware, noticed the unexpected shutdown of the entire rack.
11. 04.24.2019 00:10 EST The offsite backup transfer was completed.
12. 04.24.2019 00:35 EST The power cord was replaced. Network switches, firewalls and the server were rebooted after the unexpected shutdown.
13. 04.24.2019 02:09 EST FSCK finished on the BARRACUDA server. The SSDs are OK.
14. 04.24.2019 02:11 EST The BARRACUDA server was rebooted.
15. 04.24.2019 02:14 EST No errors were found after the reboot. The system is online.
16. 04.24.2019 02:15 EST Automated notification call from the external monitoring station that the server is online.
17. 04.24.2019 03:10 EST We found neither system corruption nor file loss. The server is online and stable.
18. 04.24.2019 03:10 EST The emergency restore was cancelled and an outage investigation was initiated.
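For reference, the key intervals in the timeline above can be worked out with a short script. This is only a sketch: the timestamps are copied from events 1, 2, 10 and 15, while the variable and helper names are ours and purely illustrative. All times are EST, so no time zone conversion is applied.

```python
from datetime import datetime, timedelta

# Timestamp layout used in the timeline above (MM.DD.YYYY HH:MM, all EST).
FMT = "%m.%d.%Y %H:%M"


def parse(ts: str) -> datetime:
    """Parse a timeline timestamp such as '04.23.2019 17:13'."""
    return datetime.strptime(ts, FMT)


def fmt(delta: timedelta) -> str:
    """Format a duration as hours and minutes."""
    minutes = int(delta.total_seconds() // 60)
    return f"{minutes // 60}h {minutes % 60:02d}m"


# Values taken from the timeline; the names are illustrative only.
power_failed = parse("04.23.2019 17:13")    # event 1: power cord failed
alert_received = parse("04.23.2019 17:17")  # event 2: monitoring call
cause_found = parse("04.23.2019 23:55")     # event 10: rack shutdown noticed
back_online = parse("04.24.2019 02:14")     # event 15: system online

time_to_alert = fmt(alert_received - power_failed)
time_to_root_cause = fmt(cause_found - power_failed)
total_outage = fmt(back_online - power_failed)

print("Time to first alert:", time_to_alert)
print("Time to root cause: ", time_to_root_cause)
print("Total outage:       ", total_outage)
```

The gap between the first alert and the discovery of the dead power cord is the largest part of the outage, which is what the improvements below aim to reduce.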
To prevent such issues in the future, WestNIC has made or will make the following improvements:
1. Upgrade the support desk to Cloud Desk by June 1st, 2019, to cover a complete power loss / network loss in the Dallas data center.
2. Regular (monthly) DC inspections by WestNIC staff.
3. The DC has assigned us a new 24/7 emergency contact.
4. The offsite backup system alone is insufficient. Local (same data center) weekly backups are needed.
5. Online chat was operational during the emergency. However, we also need to be present on other support channels such as Twitter, Facebook and Discord.
6. A new, strict SLA (Service Level Agreement) policy is needed.
Thank you for your patience and cooperation during the server outage. We apologize for the inconvenience.
04.24.2019 02:09 EST Access to the BARRACUDA server has been restored after a third manual FSCK on the primary SSDs. It seems that the server rack lost power and the server then got stuck after reboot. We are going to investigate this issue closely to prevent such long outages in the future. We appreciate your patience and cooperation during this event. The BARRACUDA server has been online since October 2015 with excellent uptime.
04.23.2019 21:57 EST Unfortunately, the BARRACUDA server got stuck during FSCK. System files may have been corrupted. We are analyzing disk partitions and preparing for a full restore. An ETA will be set right after SSD replacement / OS reload.
04.23.2019 20:25 EST The BARRACUDA server didn't come back up after the power failure in the data center (stuck at auto FSCK). We're running a manual FSCK. It may take up to 2 hours. Thank you for your patience.
04.23.2019 17:27 EST The BARRACUDA server isn't accessible. We're investigating this outage. We apologize for the inconvenience. ETA: unknown.
03.24.2019 21:41 EST The secondary DNS server (ns2.westnic.net) has been upgraded. New IP: 18.104.22.168. If you use your own name servers ("private DNS"), please replace the old IP 22.214.171.124 with the new one: 126.96.36.199 at your domain registrar (host records).