Software is inherently error-prone, and those errors can lead to the failure of an entire enterprise system. In recent months we have seen multiple “bugs” that threatened to take down the entire internet. According to software architect Roger Sessions, the primary cause of software project failures is complexity.
Network downtime is costly. An Infonetics survey of 205 medium and large businesses in North America reveals that companies are losing as much as $100 million per year to downtime. The worldwide cost of IT failure may be as much as $3 trillion! The root of the complexity problem mentioned by Sessions is not fragile software systems but rather fragile system components.
As a community of developers, we seek to develop resilient software: software that improves as the environment changes, such as a new code release with updated features. As the Society of Rugged Developers sought to define “rugged software”, it named resiliency as a part of reliability.
The graphic above breaks down the traits of Rugged Software, which overlap with the description of antifragile software systems: multiple software applications working together to achieve the owner’s business objectives. Antifragility belongs to the system traits, and resiliency belongs to reliability in an individual software application. The individual application needs to be antifragile, and the software system also needs to be resilient.
In his book Antifragile: Things That Gain From Disorder, Nassim Taleb introduces the concept of antifragility, the opposite of fragility. Fragile software breaks while antifragile software benefits from volatility.
Henrik Warne writes that software errors are rich in information, and for each bug found and fixed, the software gets a bit better. New features provide a counterbalancing force by adding bugs. The software gains quality with each bug fixed and loses quality with each new feature added.