Overview

As outlined in our general announcement (duplicated below), XMission suffered a power failure in our primary data center at 2:00 pm on Nov. 11, 2008. We are deeply sorry for the frustration and hassle this has caused our customers. We are working as quickly as possible to restore services to those of you who are still affected.

If you have specific questions or concerns, please contact support. Email: support@xmission.com Phone: 1-801-539-0852.

Reload this page for status updates.

Status: Wed Nov 12 11:56:11 MST 2008

Most of our services are restored, but we are still experiencing intermittent issues with Zimbra-based email and xmission.com email filters. We are resolving these problems as we find them. Please call us if you are experiencing a problem with email.

Status: Wed Nov 12 09:32:00 MST 2008

Messages sent to mailing lists (something@mailman.xmission.com) in the last 12 hours have been queued, but not delivered. I am flushing the mail queues now, and you should expect to see list traffic pick up within the next hour or so. --jason

Status: Wed Nov 12 09:26:02 MST 2008

FTP directories are reconstructed. Most permissions should be OK if you are using ftp.xmission.com. Other permissions may need case-by-case attention.

Status: Wed Nov 12 08:51:29 MST 2008

We are rebuilding the skeleton directory structure required for customers to log in to ftp.xmission.com. Until this is done, you may receive 'Permission Denied' errors when connecting to FTP. This should be remedied shortly.

Status: Wed Nov 12 08:09:54 MST 2008

After yesterday's outage, some email has been delayed, both inbound and outbound. We are working as quickly as possible to catch up. --Mike

Status: Wed Nov 12 07:19:32 MST 2008

/home/ftp/users/* is currently missing. We are working with our storage vendor to determine the best path forward for data recovery. --jason

Status: Wed Nov 12 07:16:29 MST 2008

All user files have been restored. --jason

shell.xmission.com services have been restored. --jason

Status: Wed Nov 12 07:11:53 MST 2008

About 100 'd' users left. --jason

Status: Wed Nov 12 07:03:33 MST 2008

About 400 'd' users left. --jason

Status: Wed Nov 12 06:55:29 MST 2008

'r' is complete. 'd' is making progress and is the last restore to complete. All other accounts are now restored. --jason

Status: Wed Nov 12 06:35:01 MST 2008

Just to clarify, email services are not currently affected by this outage.

Hosting is nearly restored and shell services will be up within the hour. --Mike

Status: Wed Nov 12 06:19:02 MST 2008

's' has finished --Mike

Status: Wed Nov 12 06:15:58 MST 2008

We are waiting for all restores to complete before remounting /home on the shell server. ETA 1 hour.

98% of all web sites have now been restored and are functional. --Mike

Status: Wed Nov 12 05:43:00 MST 2008

'c', 'd', 'r', and 's' still in progress. Approaching pre-outage disk consumption. --Brad

Status: Wed Nov 12 04:54:00 MST 2008

'p' users have been restored. 's', 'd', 'c', and 'r' are all in progress.

Status: Wed Nov 12 03:46:21 MST 2008

79% of hosted sites have been restored. --jason

Status: Wed Nov 12 03:46:21 MST 2008

We are doing our best to answer support mail as quickly as possible. Please bear with us. --Mike

Status: Wed Nov 12 03:33:43 MST 2008

'm' and 'k' have just finished. --jason

Status: Wed Nov 12 03:23:11 MST 2008

'j' has just finished, 'q' is starting, 'r' and 'd' are up next. --jason

Status: Wed Nov 12 03:14:02 MST 2008

Given the current rate of restore (700MB/min), we think we should have restored all files around 7am Nov 12, 2008. --jason
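As a rough check on that estimate, the sketch below works backward from the two figures in the update (700MB/min and a finish around 7am) to the amount of data they imply was still outstanding. It assumes the rate stays constant and truncates the status time to the minute; the function and variable names are illustrative only and are not part of any XMission tooling.

```python
from datetime import datetime

# Rate quoted in the status update; assumed constant for this estimate.
RESTORE_RATE_MB_PER_MIN = 700


def implied_remaining_mb(now, eta, rate_mb_per_min=RESTORE_RATE_MB_PER_MIN):
    """Return the data volume (MB) that a constant-rate restore moves between now and eta."""
    minutes = (eta - now).total_seconds() / 60
    return minutes * rate_mb_per_min


# Status timestamp (truncated to the minute) and the estimated finish time.
status_time = datetime(2008, 11, 12, 3, 14)
estimated_finish = datetime(2008, 11, 12, 7, 0)

remaining_mb = implied_remaining_mb(status_time, estimated_finish)
print(f"Implied data remaining: about {remaining_mb / 1000:.0f} GB")
```

Under those assumptions, 226 minutes at 700MB/min works out to roughly 158 GB left to restore at the time of this update.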

Status: Wed Nov 12 03:06:03 MST 2008

Usernames that start with 'h' should now be restored, and 'p' is just starting (we have multiple parallel restores running). --jason

Status: Wed Nov 12 02:48:29 MST 2008

We have discovered that the /home/ftp/users/* tree was not included in our backups. This means that any files that were uploaded via anonymous ftp, or placed in your ~username/ftp/* directories, have been lost. When business hours begin tomorrow, we will be working with our storage vendor to obtain a new unit to see if we can force the failed drives back into the raidgroup, allowing us to recover these files. We will send further direct announcements regarding this specific issue as we have more information. I'm deeply sorry to any of you who have been affected. We will be answering support mail through the morning to address any direct concerns. --jason

Status: Wed Nov 12 02:38:43 MST 2008

News (nntp) services have been restored. --jason

Status: Wed Nov 12 02:01:21 MST 2008

Currently impacted services:

We are currently streaming backups of /home onto the backup NetApp. Customers with usernames beginning with: '0 1 2 3 4 5 6 7 8 9 a b c e f g h i j k l m n o s t u v w x y z' are either completely restored, or in the process of being restored.

As your home directory is restored and permissions are fixed, services associated with your account will return to normal. --jason
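For readers checking whether their account is covered yet, the sketch below lists the username prefixes that do not appear in the list above. It assumes usernames begin only with a digit or a lowercase letter, which this page does not state; the variable names are illustrative.

```python
import string

# Prefixes reported above as restored or in the process of being restored.
listed = set(
    "0 1 2 3 4 5 6 7 8 9 a b c e f g h i j k l m n o s t u v w x y z".split()
)

# Assumption: every username starts with a digit or a lowercase letter.
all_prefixes = set(string.digits + string.ascii_lowercase)

# Prefixes whose restores had not yet started at the time of this update.
pending = sorted(all_prefixes - listed)
print(pending)
```

The prefixes left over are 'd', 'p', 'q', and 'r', which matches the later updates on this page reporting those restores starting afterward.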

General Announcement: November 11, 2008 10:28:44 PM MST

Subject: ANNOUNCEMENT: XMission Outage

XMission Outage
---------------

XMission experienced a serious outage while we were performing some standard
UPS maintenance today. The outage affected all services and started at
approximately 2:00 p.m. on Tuesday, November 11th. Network services for many
were partially restored by about 2:30 p.m. but some other services required a
lot of attention and took much longer.

Details
-------
About 40% of our data center, including our server room, suffered a power
outage when a technician flipped a mislabeled breaker during some standard
maintenance on one of our 3 UPS units. Although the power outage was momentary,
servers and routers often respond very poorly to losing power and sometimes
take extensive work to come back up. Unfortunately, such was the case today
with many systems.

Seriously Affected Systems
--------------------------
* An important router, which some connections and servers rely on, required
extensive attention from our network administrators.

* DNS (Domain Name Service) was sporadic for some customers for over an hour.

* Email services were down for over 5 hours.

* Web hosting suffered the longest outage because our NetApp storage appliance,
which houses all customer files and web sites, lost multiple hard drives. As a
result, we are currently restoring files to our new NetApp 2020 from our
November 9th backup, which will take many more hours to complete. We recently
purchased this new NetApp and were merely days away from getting it online.

Conclusion
----------
Today's outage was exacerbated by multiple systems responding poorly to
losing power. In spite of the holiday, our systems administrators were on site
within minutes and continue to work tirelessly to restore all services. In the
end, we should have performed this maintenance on a day when our systems
administrators were on site because problems can arise no matter how carefully
you proceed.