Dave Aiello wrote, "Yesterday's massive denial of service attack, while aimed at Microsoft SQL Server 2000 servers, exposed many other holes in infrastructure, as well as a lack of redundancy and robustness. I want to cite a few examples from CTDATA's infrastructure because I think they will be illustrative:"
- Lack of meaningful DNS diversity: At the time of the outage, CTDATA's servers had primary and secondary DNS servers located in the same colocation facility. This is a bad idea because yesterday showed that all of the routes from any one facility to the Internet may be overwhelmed with traffic simultaneously, even if they go through different ISPs.
- Lack of local mail relays for critical network services: The network monitoring service that we run does not have an SMTP server on the same subnet. This means that we depend upon one of the SMTP servers that we are attempting to monitor to email our outage alerts to us.
This also became an issue for our firewalls, because they mail their logs to administrators as they fill up. When huge amounts of traffic hit the firewalls, many events were logged, quickly filling up their memory. Those logs could not be emailed because of the network failure. So, we probably lost a good amount of information about the attack as it was occurring.
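The DNS diversity point above can be checked roughly in code. The following is a minimal sketch, not CTDATA's actual tooling: it assumes you already know each name server's IPv4 address, and it treats "same /24 network" as a stand-in for "same facility". That is only a heuristic, since servers on different networks can still share a building or an upstream path. The host names, the addresses (taken from documentation-only ranges), and the function names are all hypothetical.

```python
import ipaddress
from collections import defaultdict


def group_by_network(nameservers, prefix_len=24):
    """Group name server host names by the IPv4 network they fall in.

    `nameservers` maps host name -> IPv4 address string.
    `prefix_len` is an assumed proxy for "same facility"; adjust as needed.
    """
    groups = defaultdict(list)
    for name, addr in nameservers.items():
        # strict=False lets us pass a host address and get its enclosing network
        network = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[network].append(name)
    return dict(groups)


def lacks_diversity(nameservers, prefix_len=24):
    """Return True when every name server sits in a single network."""
    return len(group_by_network(nameservers, prefix_len)) < 2


# Hypothetical example data using documentation address ranges.
same_site = {"ns1.example.com": "192.0.2.10", "ns2.example.com": "192.0.2.11"}
split_sites = {"ns1.example.com": "192.0.2.10", "ns2.example.com": "198.51.100.7"}

if __name__ == "__main__":
    print("same_site lacks diversity:", lacks_diversity(same_site))
    print("split_sites lacks diversity:", lacks_diversity(split_sites))
```

A check like this only flags the most obvious case. Confirming real diversity still means knowing where each server is physically hosted and which upstream routes it depends on.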
Dave Aiello continued, "We knew about these infrastructure issues, but haven't been able to deal with them expeditiously because they require more server resources than we have available and can afford at the moment."
"Although our firewalls prevented the attack from reaching our servers, we still experienced total loss of connectivity for about 10 hours. The connectivity loss is attributable to routers at ISPs upstream from our servers. Those routers simply went down when massive amounts of traffic hit them. When CTDATA's servers came back on-line, I received over 700 email messages within an hour, mostly from servers that had the ability to queue their error and alert messages in memory until the email servers came back on-line."
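The queue-until-the-mail-server-returns behavior described above can be sketched as follows. This is an illustration under stated assumptions, not a description of any product mentioned in this post: `AlertSpool` is a hypothetical class, `send` is a hypothetical callable (for example, a thin wrapper around Python's `smtplib`) that raises `OSError` when delivery fails, and the queue is kept in a local file rather than in memory so that pending alerts would survive a restart.

```python
import json
from pathlib import Path


class AlertSpool:
    """Try to deliver alerts immediately; keep undelivered ones in a local file."""

    def __init__(self, spool_path, send):
        self.spool_path = Path(spool_path)
        # `send` is an assumed callable(message) that raises OSError on failure.
        self.send = send

    def alert(self, message):
        """Send now if possible; otherwise append to the spool. Returns True if sent."""
        try:
            self.send(message)
            return True
        except OSError:
            with self.spool_path.open("a", encoding="utf-8") as spool:
                spool.write(json.dumps(message) + "\n")
            return False

    def flush(self):
        """Retry spooled alerts in order. Returns the number delivered."""
        if not self.spool_path.exists():
            return 0
        lines = self.spool_path.read_text(encoding="utf-8").splitlines()
        pending = [json.loads(line) for line in lines if line]
        delivered = 0
        for message in pending:
            try:
                self.send(message)
                delivered += 1
            except OSError:
                break  # still unreachable; keep the rest for the next attempt
        remaining = pending[delivered:]
        if remaining:
            body = "".join(json.dumps(m) + "\n" for m in remaining)
            self.spool_path.write_text(body, encoding="utf-8")
        else:
            self.spool_path.unlink()
        return delivered
```

Calling `flush()` on a timer would produce the kind of burst described above once connectivity returns. Spooling to disk is a design choice made for this sketch; a device that queues only in memory, like the firewalls mentioned earlier, can lose entries once that memory fills.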
"I object to articles like 'Massive Internet Outage was Preventable' from the UPI because they give people the impression that attacks like these are predictable, easy-to-understand, have straightforward solutions, and only have obvious side effects. Nothing could be further from the truth."