DZone

What does it mean to optimise for resilience? Why is resilience so valuable to an organisation, and how can operability contribute towards it? In this article, Steve Smith explains what optimising for resilience is, and why it is so valuable to IT delivery. This is part of the Resilience As A Continuous Delivery Enabler series:

  1. The Cost And Theatre Of Optimising For Robustness
  2. When Optimising For Robustness Fails
  3. The Value Of Optimising For Resilience
  4. Resilience As A Continuous Delivery Enabler – TBA

Resilience Is Graceful Extensibility

When an organisation wants to improve the reliability of its IT services, it should optimise for resilience. Resilience is the ability to "absorb or avoid damage without suffering complete failure," and it is achieved by minimising the Mean Time To Repair (MTTR) of services. Some classes of failure should never occur, some failures are more costly than others, and some safety-critical systems should never have failures, but in general, organisations should adhere to John Allspaw’s advice that "being able to recover quickly from failure is more important than having failures less often."
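
To make MTTR concrete, here is a minimal Python sketch of how it might be calculated from incident records. The incident timestamps, the function names, and the 30-day time between failures are all hypothetical, chosen purely for illustration. The availability formula, MTBF / (MTBF + MTTR), is the common steady-state approximation and is not taken from the article itself.

```python
from datetime import datetime, timedelta

# Hypothetical incident log: (time failure was detected, time service was restored).
incidents = [
    (datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 9, 30)),     # 30 minutes
    (datetime(2024, 1, 17, 14, 0), datetime(2024, 1, 17, 15, 30)), # 90 minutes
    (datetime(2024, 2, 2, 22, 15), datetime(2024, 2, 2, 22, 45)),  # 30 minutes
]


def mean_time_to_repair(incidents):
    """Average duration between detection and restoration across incidents."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum(
        (restored - detected for detected, restored in incidents),
        timedelta(),
    )
    return total / len(incidents)


def availability(mtbf, mttr):
    """Steady-state availability approximation: MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)


mttr = mean_time_to_repair(incidents)
print(mttr)  # 0:50:00

# Assumed mean time between failures, for illustration only.
mtbf = timedelta(days=30)
print(availability(mtbf, mttr))      # availability with the current MTTR
print(availability(mtbf, mttr / 2))  # same failure rate, recovery twice as fast
```

Under these assumptions, halving MTTR raises availability without any change in how often failures occur, which is one way to read Allspaw's advice.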

Source: DZone