Posts Tagged ‘Data Center’


Datacenter Dynamics 2011 in Dallas

Datacenter Dynamics held their annual Dallas conference on November 1, 2011 in Richardson. I was very fortunate to be a guest of a product vendor, PDI. As an aside, I was very impressed with their PowerWave Bus System; it is a very slick product, but that will have to be a post for another time. Several talks were given, and all of them were educational, if also something of an advertising opportunity. My personal favorite was a presentation by Cisco on their new data centers in North Texas, given by Mr. Tony Fazackarley. What I found most enjoyable about this talk was the holistic description of the new facility and why Cisco made the choices it did with respect to cooling, backup power and disaster recovery.

The Cisco facility that was the topic of the presentation uses a direct cooling scheme rather than a traditional raised access floor with remote CRAC/CRAH units supplying air under the floor. If I understood the diagrams correctly, cool air is supplied from above and allowed to ‘fall’ into the cold aisles to feed the IT equipment. The cabinets have chimneys that send hot air directly into the upper space of the data hall; there is no ceiling, just a support grid structure. The hot air returns to an AHU or vents directly outside during economizer operation. This lines up with the current industry trend of supplying cold air as close to the IT load as possible, reducing the fan power otherwise spent pressurizing an underfloor plenum.

The next area I found intriguing was the choice of backup power for the data center. This facility opted for rotary UPS systems paired with diesel generators in lieu of traditional static UPS systems. One advantage mentioned is that frequent short outages can cycle battery strings and degrade them beyond their design life, whereas a rotary system rides through such events with no reduction in useful life. In my data center design work I have not yet seen a client deploy these systems; most opt for a static UPS paired with batteries. When I asked a colleague about the mechanical complexity of rotary systems, and whether they expose an owner to more downtime risk than a battery plant replacement would, he was confident that they are very robust: parts will wear out or break down, but maintenance is a straightforward procedure. From the presentation, it sounds like the major drawback is the noise of the engines that run constantly to turn the rotary system (I believe he mentioned a constant noise level of 110 dB).

Another interesting area of discussion was Cisco’s disaster recovery strategy: many of their data centers are paired for redundancy, and smaller existing sites were converted into disaster recovery sites for critical processes. Care is taken in site selection to ensure that no single event is likely to take out both facilities of a pair. All told, it was a very informative presentation offering a lot of insight into how Cisco handles site selection, tier ratings and best practices. I hope to have more posts from this year’s conference after I have a chance to review my notes (there are quite a lot!)



EMP Attacks

It’s fair to say that after the attacks on September 11th, 2001, our discussions on security changed forever.  I personally recall never having conceived of attacks of that nature prior to that day.  Since then, security has received a new and enthusiastic level of scrutiny.  Many people make a living thinking of scenarios that might seem unimaginable to the rest of us.  They look around and ask, ‘where are our soft spots as a country?’ and critical infrastructure always seems to fit the bill.

The concern is straight out of a Tom Clancy novel: we are a technologically advanced nation that relies heavily on electronics and integrated circuitry, and some rogue force could acquire an EMP device to decimate our technology and thrust us back into the stone age. The topic of electromagnetic pulse (EMP) attacks has come up in data center design more than once, and it is often a topic of discussion at data center forums and consortiums.

First, some history. The first noted EMP disturbance was actually a by-product of high altitude nuclear detonation tests over Johnston Island in the Pacific. A 1962 detonation named ‘Starfish Prime’ caused electrical disturbances in Hawaii, nearly 900 miles away. The physics are complicated, but during a nuclear detonation the Compton effect induces a massive surge in electrical equipment, one that usually exceeds what its conductors can handle. The result is fried, non-functional circuitry. Naturally, this effect got the attention of the Department of Defense, which saw several potential applications for it. Tests continued until 1963, when the Partial Test Ban Treaty ended above-ground nuclear testing over concerns about radioactive pollution of the Earth’s atmosphere. No EMP from nuclear ordnance has been created since.

In spite of the ban, the effects of high altitude detonations were well understood by that time, so DoD standards and specifications were developed to protect sensitive electronics in critical buildings and war machines. The DoD built gigantic testing facilities to simulate the effect, the first being the Trestle at Kirtland Air Force Base, another being the EMPRESS system developed by the Navy. From what I have read, these did simulate the effect but could not create a power spike on the magnitude of a nuclear weapon. They were better than nothing, but less than the real thing.

Fast forward to today: the concern is fresh on the minds of anyone building a critical facility. If the more robust electronics of the postwar era could not stand up to EMP, how could the delicate integrated circuitry of modern electronics ever stand a chance? How can we protect our sensitive equipment from this kind of attack? The general consensus today is that a Faraday cage is the best protection, and this has manifested itself in everything from very sensible sheet metal rooms and computer cabinets to the questionable installation of chicken wire in the building envelope. It’s here that I would like to make two arguments: 1) you can’t really guarantee protection of your equipment, for several reasons, and 2) with cloud computing taking off, this will matter less and less for end users.

Here are the problems with trying to harden a facility against EMP. First, there isn’t much information available to the public about this kind of weapon; remember, there has not been a documented EMP event since before 1963, nearly 50 years ago. Second, there is no viable way to test or commission an installation of chicken wire (or any other protection scheme). This is especially problematic because every penetration into the cage is a potential conductor that could compromise its integrity: every wire, pipe, duct or structural member. DoD specs call for special line arresters and filters on all incoming power lines. Finally, consider what would be required to generate the EMP in the first place. A well-placed high altitude nuclear detonation over Kansas City would affect most of the lower 48 states and substantial portions of Canada and Mexico. The list of candidates capable of accomplishing this is short, and the act flies in the face of current theories of nuclear deterrence, namely that a nation keeps these weapons in the hope of never using them. None of this addresses the much larger concern of a society thrust into darkness, with power and infrastructure in ruins.

And here’s why it won’t really matter for end users in the years to come. The best shield against EMP is actually the Earth itself: the extent of the EMP is the line of sight to the horizon from the point of detonation, and everything beyond is unaffected. As companies migrate to the cloud, their information and processes will live redundantly across a wide physical geography. If Google’s American data centers went down, its European and Asian centers would still run, and processes would be backed up. This kind of thinking is not new; companies already place redundant data centers a minimum distance apart so that a single event is unlikely to take out both. Yes, physical infrastructure would be lost, and the cost would be devastating to a facility owner, but the real value of a data center is the business processes that run in it, and those would surely survive such an attack.
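To put a rough number on that horizon limit: for a burst at altitude h over a spherical Earth, the ground distance to the line-of-sight horizon is d ≈ R·arccos(R/(R+h)). The sketch below is purely illustrative (the function name is mine, the figures are approximate, and real EMP coverage depends on more than simple geometry):

```python
import math

EARTH_RADIUS_KM = 6371.0

def horizon_ground_range_km(burst_altitude_km):
    """Approximate ground distance from the point under the burst to the
    line-of-sight horizon, assuming a smooth spherical Earth."""
    r = EARTH_RADIUS_KM
    return r * math.acos(r / (r + burst_altitude_km))

# Starfish Prime detonated at roughly 400 km altitude; the line of sight
# from that height reaches out to roughly 2,200 km of ground range.
print(round(horizon_ground_range_km(400.0)), "km")
```

Even at that extreme altitude, the effect is bounded by geometry, which is why facilities spread across continents sit comfortably outside any single burst's footprint.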


Moving Beyond the Tier Rating

This is an interesting article about an emerging resiliency strategy for large-scale IT operations. If you read through the Tier guidelines from the Uptime Institute, you’ll note that for the two upper tiers (the more resilient with respect to downtime), the generator plant is considered the primary power source for the building, and all other utility feeds are just lagniappe. Well, what happens when those utility feeds are more reliable than a generator plant?

A whole series of events must occur in the proper order to ensure that IT processes are preserved from the moment a utility feed drops to the moment the generators are carrying the load. This is a complex process, and it is why we commission data centers: we want to be sure these backup systems come online without a hitch. But with so many parts that must work properly, there is a real possibility of failure. In basic terms, the sequence might go something like this:

1. The utility feed goes down

2. A static switch at a UPS throws over to battery or flywheel power temporarily

3. Generators are brought online

4. Switchgear transfers the load from the failed utility feed to the generators

5. The static switch at the UPS transfers back to the primary feed
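The sequence above can be sketched as a toy state machine. This is purely an illustration of the ordering, not real switchgear controls; the class and method names (`FailoverController`, `utility_fails`) are invented for the example:

```python
# Toy model of the utility-failure sequence described above.
# All names here are invented for illustration; real facilities use
# dedicated controls and PLCs, not application code.

class FailoverController:
    def __init__(self):
        self.source = "utility"    # what feeds the UPS input
        self.ups_mode = "online"   # "online" = passing power, "stored" = battery/flywheel
        self.log = []

    def utility_fails(self):
        # 1. The utility feed goes down;
        # 2. the static switch rides through on stored energy.
        self.ups_mode = "stored"
        self.log.append("utility lost; UPS on battery/flywheel")
        # 3. Generators start and come up to speed and voltage.
        self.log.append("generators online")
        # 4. Switchgear transfers the input from the failed utility to the generators.
        self.source = "generator"
        self.log.append("switchgear transferred to generator")
        # 5. The static switch returns the UPS to its (now generator-fed) primary input.
        self.ups_mode = "online"
        self.log.append("UPS back on primary feed")

c = FailoverController()
c.utility_fails()
print(c.source, c.ups_mode)  # generator online
```

Each transition is a point where something can go wrong (a breaker fails to close, a generator fails to start), which is exactly what commissioning is meant to flush out.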

The equipment installed to make this happen is very, very expensive. The generators alone can easily run into six figures per set, and the required switchgear and UPS modules constitute a substantial part of the project cost. They also carry substantial maintenance costs. The other factor is that a company with redundant processes across the globe can afford downtime at any given facility. In this way, it’s a bit like a rental car business that carries no insurance on its fleet, because the fleet itself IS the insurance. The most telling part of the article is the last section, where they rightly point out that this approach would be courting disaster for a smaller operation that is more critical to a company’s function.

In the case of the power grid across the pond, going nearly 30 years without an outage is nothing short of amazing! The Facebooks and Googles of the world appear to have transcended the world of tier ratings in a big way, and they now enjoy a competitive advantage with their lower-cost facilities.


Life is Imperfection

Fly in the Ointment? Meet Cricket in the Epoxy.