Why does it matter?
App objectives and priorities differ from enterprise to enterprise and customer to customer, but one thing remains constant: everyone wants their digital applications to work. As a customer, you want to be able to do things online at a convenient time. As a business or service provider, you want your customers to be able to do what they need to, when they want to do it. That requires applications that are reliable and resilient. Resiliency isn’t just a buzzword anymore. It is a key component of reliability.
The Institute of Electrical and Electronics Engineers (IEEE) Reliability Society defines reliability as “the probability of failure-free software operation for a specified period of time in a specified environment.” A reliable app functions just as the designer intended, whenever and wherever a customer is connected. But that doesn’t mean that every component of the app has to be absolutely flawless all of the time, which leads us to the difference between reliability and resiliency.
Reliability: The target at which software designers have always aimed: perfect operation all the time. Reliability is the planned outcome.
Resiliency: The ability of an app to recover from certain types of failure and yet remain functional from the customer perspective. Resilience is the way you achieve the outcome.
Every application runs the risk that a single feature or function will fail and have a cascading effect on the app’s functionality or availability. For example, an update to the cloud-based address book used in an app could cause the address book to fail while the remainder of the app performs as designed. Resiliency in this case means building instructions into the app as follows: if the address book fails, suspend its use, then activate an alternative address book located elsewhere. That resilience is key to the app’s reliability.
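The address book instructions can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a description of any particular product: the names `AddressBookUnavailable`, `make_resilient_lookup`, `broken_cloud_book`, and `backup_book` are all hypothetical, and the two address books are stand-in functions rather than real services.

```python
from typing import Callable


class AddressBookUnavailable(Exception):
    """Raised by a (hypothetical) address book that cannot serve requests."""


AddressLookup = Callable[[str], str]


def make_resilient_lookup(primary: AddressLookup, fallback: AddressLookup) -> AddressLookup:
    """Return a lookup that suspends the primary address book once it fails."""
    state = {"primary_suspended": False}

    def lookup(name: str) -> str:
        if not state["primary_suspended"]:
            try:
                return primary(name)
            except AddressBookUnavailable:
                # Suspend the failing address book so later calls skip it.
                state["primary_suspended"] = True
        # Activate the alternative address book located elsewhere.
        return fallback(name)

    return lookup


def broken_cloud_book(name: str) -> str:
    # Stand-in for a cloud address book that broke after an update.
    raise AddressBookUnavailable("cloud address book failed after update")


def backup_book(name: str) -> str:
    # Stand-in for the alternative address book.
    return {"Ada": "ada@example.com"}[name]


lookup = make_resilient_lookup(broken_cloud_book, backup_book)
result = lookup("Ada")
print(result)
```

From the customer’s perspective the lookup still succeeds; only the source of the answer has changed. A real app would also need a way to bring the primary address book back once it recovers, which this sketch leaves out.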
Resilience in this context means that failures must be compartmentalized: one function’s failure must not cause other functions to fail. When a function, like the address book, is temporarily unavailable, the rest of the application still runs. After the failure has been contained, an instruction set activates, restarting the failing component. These steps need to be automatic, immediate, and reliable. When the component’s functionality has been restored, normal collaboration with other components can resume.
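One way to picture containment followed by an automatic restart is a small wrapper around each component. The sketch below is an assumption-laden illustration, not a production design: the `Component` class, the `make_address_book_factory` helper, and the simulated crash are all hypothetical, and a real system would add logging, limits on repeated restarts, and narrower exception handling.

```python
from typing import Callable, Optional

Handler = Callable[[str], str]


class Component:
    """Wraps one feature so its failure stays contained and it can be restarted."""

    def __init__(self, name: str, start: Callable[[], Handler]):
        self.name = name
        self._start = start        # factory that builds a fresh handler
        self._handler = start()
        self.restarts = 0

    def call(self, request: str) -> Optional[str]:
        try:
            return self._handler(request)
        except Exception:
            # Contain the failure: the caller receives None, not an exception,
            # so other components keep running.
            # Then restart the failing component automatically and immediately.
            self._handler = self._start()
            self.restarts += 1
            return None


def make_address_book_factory() -> Callable[[], Handler]:
    """Hypothetical factory whose first handler crashes and later ones work."""
    starts = {"count": 0}

    def start() -> Handler:
        starts["count"] += 1
        generation = starts["count"]

        def handler(name: str) -> str:
            if generation == 1:
                raise RuntimeError("address book crashed")
            return f"{name}@example.com"

        return handler

    return start


address_book = Component("address_book", make_address_book_factory())
messaging = Component("messaging", lambda: (lambda text: f"sent: {text}"))

first = address_book.call("ada")    # fails, is contained, triggers a restart
other = messaging.call("hello")     # unaffected by the address book failure
second = address_book.call("ada")   # served by the restarted component
print(first, other, second, address_book.restarts)
```

The messaging component never sees the address book’s failure, and once the address book has been restarted, normal calls to it resume.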
CabForwardSM believes strongly in writing software that doesn’t suck. We focus on resilience in our digital product planning cycle because we accept that failure is a fundamental part of the programming model. We build software to be rugged so that it can continue to operate under difficult real-world conditions. Why? If your app quits working, it is your customer who gets frustrated or upset, and that affects your livelihood. We care.