Nick Ewing, Managing Director, EfficiencyIT:
I was recently invited to contribute to a feature for Intelligent Data Centres, which brought together a host of data centre industry leaders to answer the question, “What are the most effective strategies for ensuring uninterrupted mission-critical operations in a data centre, particularly in the face of unexpected disruptions?”
In short, there are a host of strategies that owners and operators can undertake to ensure mission-critical reliability, especially in the face of unexpected disruption.
But just as with all data centre and distributed IT deployments, there’s no one-size-fits-all answer, and at EfficiencyIT we believe in taking a consultative approach to critical applications, addressing each customer’s requirements on a case-by-case basis.
However, when seeking to avoid downtime or improve the resiliency of your infrastructure systems, there are four key areas we believe are vital for organisations to address, and which are common to most, if not all, data centre applications.
1. Get visibility of your critical systems and infrastructure assets – where they’re located, what their health or operating status is currently, and how they’ve performed not only during the last twenty-four hours, but over the last three, six or twelve months. Here the three R’s – reliability, redundancy, and resiliency – are vital, and that extends from the operator’s critical power systems all the way through to their cooling and generator equipment. A failure in just one of these places will often trigger a series of unanticipated events and, if left undetected or without remedy, can have a catastrophic effect – loss of service, business-critical data, and even revenue.
2. Leverage a data centre infrastructure management (DCIM) software platform – such as Schneider Electric’s EcoStruxure IT or EkkoSense’s data centre optimisation software – which can aggregate all your systems and data into one platform and use AI to generate real-time, actionable information. Here the devil is in the data, and a software platform can be the very difference between failure and success.
3. Ensure you have a regular and robust condition-based maintenance programme in place – and that you’re working with an expert engineering team to address potential issues proactively, before they have major implications. Through new DCIM platforms, customers can share insights with their engineering and services teams securely, allowing them to address those issues – a battery or cooling failure, for example – before they cause an outage. As an Elite Data Centre and #EcoXpert Service Partner to Schneider Electric, this is one area where we at EfficiencyIT are perfectly placed to support.
4. Design your data centre for resiliency in an N+1 configuration – and to recognised standards, such as BS EN 50600. This will allow you to ensure greater redundancy in all your equipment – uninterruptible power supplies (UPS), power distribution units (PDUs), generators, switchgear, and cooling. Doing so will provide an essential safeguard for all your critical systems, enabling you to future-proof your facility and minimise the impact of downtime.
Further, the additional redundancy will allow you to plan outage scenarios in advance, and to both test and switch off equipment in a controlled manner, ensuring everything works as expected in the face of unexpected disruption.
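To make the N+1 idea concrete, here is a minimal sketch of the underlying arithmetic: N is the number of units needed to carry the load, and one further unit is added as a spare. The function names and the 400 kW / 150 kW figures are illustrative assumptions, not values from any real design or standard, and a real sizing exercise would also account for derating, growth and maintenance windows.

```python
import math


def units_for_n_plus_1(load_kw: float, unit_capacity_kw: float) -> int:
    """Return the unit count for N+1: N units to carry the load, plus one spare.

    Both arguments are hypothetical inputs chosen for illustration.
    """
    if load_kw <= 0 or unit_capacity_kw <= 0:
        raise ValueError("load and unit capacity must be positive")
    n = math.ceil(load_kw / unit_capacity_kw)  # N: minimum units to carry the load
    return n + 1                               # +1: a single redundant unit


def survives_single_failure(load_kw: float, unit_capacity_kw: float,
                            installed_units: int) -> bool:
    """Check whether the remaining units still carry the load if one unit is lost."""
    return (installed_units - 1) * unit_capacity_kw >= load_kw


# Illustrative figures only: a 400 kW critical load on 150 kW UPS modules.
needed = units_for_n_plus_1(400, 150)
print(needed)
print(survives_single_failure(400, 150, needed))
```

The same check can be run against each equipment class in turn (UPS, cooling, generators) to see where a single failure, or a planned switch-off for testing, would leave the load unsupported.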
Ultimately, when seeking to avoid an outage, prevention is far better than cure. With Uptime Institute stating that human error plays a role in about two-thirds of all outages, it’s better to be safe than sorry.
For those interested, you can read the full article here, and big thanks to Ella Hutchinson for the opportunity.