Coinbase just spent 16 million dollars on a 30-second Superbowl ad. It seems like the ad worked, because their website was promptly overwhelmed with traffic and crashed. Maybe they should have spent a bit on resilient network infrastructure as well.
The problem with many of the IT infrastructures I see is that they are brittle. Each component can be resilient with load balancing and database failover without making the overall system robust. Reliability engineering is a cross-domain discipline, and it is not enough that each team builds robustness into their little piece of the total landscape. Who is responsible for the overall reliability of your systems?