FI-HEL1: Outage
Incident Report for UpCloud
Postmortem

We sincerely apologize for the loss of service you had to endure last night between 18:30 and 20:00 EEST in our FI-HEL1 data centre.

While we realise this post mortem does not excuse the loss of service, we hope it explains our actions and what we will do to avoid such issues in the future.

All of UpCloud’s FI-HEL1 servers lost connectivity to the internet at 18:30 EEST. Our monitoring software alerted us immediately and we began to investigate. We posted the incident on our status page at 18:42 EEST while our staff continued their work.

By about 18:54 EEST we had narrowed the cause down to a distributed denial-of-service (DDoS) attack.

By 19:45 EEST we had mitigated the attack enough to bring three quarters of the servers back online. By 20:00 EEST all servers were connected normally.

The attack continued late into the night, but it did not affect our customers due to filtering and restrictions we put in place.

We will compensate all affected customers according to our Service Level Agreement. Once the accounts have been credited, we will send a separate e-mail to all administrative contacts stating the amount.

While we cannot prevent every possible outage, we do prepare ourselves and our corporate processes so that we can fix outages as quickly as possible. During such incidents we communicate as much as we can (without affecting repair work) through unaffected third-party services, such as our status page and Twitter account.

You can follow us on Twitter at @upcloudcom and subscribe to service updates at http://status.upcloud.com.

Again, we are truly sorry for the loss of service.

Joel Pihlajamaa
Chief Technology Officer

Antti Vilpponen
General Manager

Posted Apr 08, 2014 - 04:48 UTC

Resolved
The incident is now resolved and we will be issuing a post mortem on the incident as soon as possible.

We sincerely apologize for the loss of service.
Posted Apr 07, 2014 - 18:50 UTC
Monitoring
We have resolved the network outage and all servers should now be connecting normally. We are still monitoring the situation.
Posted Apr 07, 2014 - 17:04 UTC
Update
We have managed to bring partial connectivity to the site with some virtual servers now back online. We are working to bring back full connectivity.
Posted Apr 07, 2014 - 16:54 UTC
Identified
We've identified this as a networking issue. We're working on it right now.
Posted Apr 07, 2014 - 15:54 UTC
Investigating
We're investigating a major outage at our FI-HEL1 data centre. We will update you as soon as possible.
Posted Apr 07, 2014 - 15:44 UTC