The Guard Rail Pattern

There is a simple way to prevent many IT disasters, and it is sadly underused. It’s not on the standard lists of design patterns, but I call it the “Guard Rail” pattern.

It would have prevented the IT disaster that dominates the news cycle in Denmark these days. Techno-optimists have forced a new digital building valuation on the long-suffering Danes, and it is an unmitigated catastrophe. The point is to replace the professional appraisers who determine the value of a property for tax purposes with a computer system. And many of the results from the computer are way off. Implementing a Guard Rail pattern would mean that the output from the new system would be compared to the old one, and those valuations that are, for example, 3x higher would be stopped and manually processed.
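A minimal sketch of what such a guard rail could look like, assuming both the old and the new valuation are available for each property. The names `guard_rail`, `Decision`, and `MAX_RATIO` are made up for illustration, and the symmetric check for suspiciously low valuations is my own assumption; the 3x threshold is just the example figure from above.

```python
from dataclasses import dataclass

# Assumed threshold, taken from the "3x higher" example in the text.
MAX_RATIO = 3.0


@dataclass
class Decision:
    property_id: str
    new_value: float
    accepted: bool
    reason: str


def guard_rail(property_id: str, old_value: float, new_value: float,
               max_ratio: float = MAX_RATIO) -> Decision:
    """Accept the new valuation only if it is plausible relative to the old one."""
    if old_value <= 0 or new_value <= 0:
        return Decision(property_id, new_value, False, "non-positive valuation")
    ratio = new_value / old_value
    if ratio >= max_ratio:
        # Far above the old valuation: stop it and hand it to a human.
        return Decision(property_id, new_value, False,
                        f"{ratio:.1f}x the old valuation")
    if ratio <= 1 / max_ratio:
        # Assumption: a collapse in value is just as suspicious as a jump.
        return Decision(property_id, new_value, False,
                        f"only {ratio:.2f}x the old valuation")
    return Decision(property_id, new_value, True, "within guard rail")
```

Anything that comes back with `accepted` set to `False` goes to a manual queue instead of being sent to the property owner.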

A colleague just shared a video of the latest iteration of the Tesla Full Self Driving mode. This version seems to be fully based on Machine Learning. Previous versions used ML to detect objects and traditional algorithmic programming to determine how to drive. As always infatuated with his own cleverness, Elon Musk does not seem to think that guard rails are necessary. Never mind that the FSD Tesla would have run a red light had the driver not stopped it. Implementing the Guard Rail pattern would mean that a completely separate system gets to evaluate the output from the ML driver before it gets passed to the steering, accelerator, and brakes.

When I attach a computer to my (traditional) car to read the log, I can see many “unreasonable value from sensor” warnings. This indicates that traditional car manufacturers are implementing the Guard Rail pattern, doing a reasonableness check on inputs before they pass the values to the adaptive cruise control, lane assist, and other systems. But the Boeing 737 MAX 8 flight control software was missing a crucial Guard Rail, allowing the computer to override the pilot and fly two aircraft into the ground.
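The reasonableness check on sensor inputs can be sketched roughly like this. The sensor names and ranges in `PLAUSIBLE_RANGES` are hypothetical, as is the function `checked_reading`; real limits would come from the sensor specification, and I have no idea how any particular manufacturer actually implements it.

```python
from typing import List, Optional

# Hypothetical plausible ranges; real limits come from the sensor's specification.
PLAUSIBLE_RANGES = {
    "speed_kmh": (0.0, 300.0),
    "steering_angle_deg": (-720.0, 720.0),
}


def checked_reading(sensor: str, value: float, log: List[str]) -> Optional[float]:
    """Return the value if it is plausible, otherwise log a warning and return None."""
    low, high = PLAUSIBLE_RANGES[sensor]
    # NaN fails this comparison too, so a broken sensor reporting NaN is rejected.
    if not (low <= value <= high):
        log.append(f"unreasonable value from sensor {sensor}: {value}")
        return None
    return value
```

The downstream system only ever sees values that passed the check, and has to decide what to do when it gets `None` (for example, disengage and alert the driver).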

In your IT organization, discuss where it makes sense to implement the Guard Rail pattern. Your experienced developers can probably remember several examples where Guard Rails would have saved you from embarrassing failures. There is no need to keep making these mistakes when there is an easy fix.

Would You Notice the Quality of Your AI Dropping?

You know that ChatGPT is getting more politically correct. But did you know that it is also getting dumber? Researchers have repeatedly been asking it to do tasks like generating code to solve math problems. In March 2023, ChatGPT 4 could generate functioning code 50% of the time. By June, that ability had dropped to 10%. If you’re not paying, you are stuck with ChatGPT 3.5. This version managed 20% correct code in March but was down to approximately zero by June.

This phenomenon is known to AI researchers as “drift.” It happens when you don’t like the answers the machine gives, and take the shortcut of tweaking the parameters instead of expensively re-training your model on a more appropriate data set. Twisting the arm of an AI to generate more socially acceptable answers has been proven to have unpredictable and sometimes negative consequences.

If you are using any AI-based services, do you know what the engine behind the solution is? If you ask, and your vendor is willing to tell you, you will find that most SaaS AI solutions today are running ChatGPT with a thin veneer of fine-tuning. Unless you continually test your AI solution with a suite of standard tests, you will never notice that the quality of your AI solution has gone down the drain because OpenAI engineers are pursuing the goal of not offending anyone.
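A suite of standard tests does not have to be elaborate. Here is a minimal sketch, assuming your AI service can be wrapped in a function that takes a prompt and returns text. The names `pass_rate` and `quality_dropped`, the checker functions, and the 5-point tolerance are all my own assumptions, not anything a vendor provides.

```python
from typing import Callable, List, Tuple

# Each case pairs a prompt with a checker that decides whether the answer is acceptable.
TestCase = Tuple[str, Callable[[str], bool]]


def pass_rate(model: Callable[[str], str], cases: List[TestCase]) -> float:
    """Run every case through the model and return the fraction that passed."""
    if not cases:
        raise ValueError("the test suite is empty")
    passed = sum(1 for prompt, check in cases if check(model(prompt)))
    return passed / len(cases)


def quality_dropped(current: float, baseline: float, tolerance: float = 0.05) -> bool:
    """True if the current pass rate is below the recorded baseline by more than the tolerance."""
    return current < baseline - tolerance
```

Run the same suite on a schedule, store the pass rate, and raise an alarm when `quality_dropped` returns `True`. The value is in the fixed set of prompts and the recorded baseline, not in the code.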

Offering Alternatives

Are you building critical software? Then you know to offer a fallback option if something – despite all your testing – does not work. That is often not a concern in organizations that can simply force users to suffer their app, like the public sector in Denmark, where every parent of a schoolchild must use the “Aula” app. Unfortunately, a botched upgrade means that many cannot log in.

Having only smartphone apps makes you vulnerable. The app stores do not offer older versions, so once you have rolled out a defective version, you (and your users) are up the creek. The mitigation for this risk is to also offer a responsive web application with only the most crucial features.

Take a look at the smartphone apps your organization offers to its customers. Are any of them critical? If so, do you have an alternative ready?

In Praise of (Useful) Managers

You do need some managers. Elon Musk is trying to prove that Twitter can be run with only himself and the people who write code, and it’s not going well. It turns out that it takes a little more to run an organization than just coding and tweeting.

For example, Elon had announced that only enterprise customers who would pay $$$ would have access to the API. But he had fired everyone who was able to process an application for an enterprise license. So when the last overworked API engineer committed the change that implemented the limit, there were no paying customers because there was nobody to take the money of the few tool vendors willing to pay up.

Your overhead grows inexorably. Unless you pay very close attention, the fraction of total headcount actually writing code goes lower and lower. To avoid ending up having to take a chainsaw to your organization as Elon has done, calculate your coder percentage today and keep track of it.

Once You Grow Up, You Need to Stop Moving Fast and Breaking Things

Moving fast and breaking things can be fine for a startup. They might need to iterate several times and maybe even pivot once or twice before they achieve product/market fit. It is not OK for an established business. Facebook has long since given up on this strategy, but Twitter, under Elon Musk, has rediscovered it. By thrashing around and changing direction daily, they are alienating both the users and the advertisers who were supposed to pay. If you want to move fast, roll out changes to a small percentage of your users. A mature continuous delivery organization practices blue/green deployment, but even if you are not doing CI/CD, you can still test changes with a small subset of your users. Don’t uncritically inflict the latest great idea on your entire user population.
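Rolling out to a small percentage of users takes very little code. This is one possible sketch, assuming each user has a stable ID; the function name `in_rollout` and the hashing scheme are my own illustration, and most feature-flag products offer something equivalent out of the box.

```python
import hashlib


def in_rollout(user_id: str, feature: str, percent: float) -> bool:
    """Decide deterministically whether this user gets the new feature.

    The same user always lands in the same bucket for a given feature,
    so nobody flips back and forth between old and new behavior.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 10_000  # bucket number from 0 to 9999
    return bucket < percent * 100          # e.g. 5 percent covers buckets 0 to 499
```

Start with a low `percent`, watch your error rates and support tickets, and only then raise it. Including the feature name in the hash means different features hit different subsets of users.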

The Wrong Way to Use Open Source

The glass is less than half full, and that is worrying. One of the focus areas in the 2022 State of Data Science report was enterprise adoption of open source. On every important metric, less than half the respondents gave the answer I was hoping for. For example, “We use a managed repository” got 43%, and “We use a vulnerability scanner” got 36%.

With such a low security maturity, it is obvious that the Log4j debacle and recent hacktivism have dented open source adoption. 40% report scaling back the use of open source in the past year.

Open Source gives you transparency that proprietary software doesn’t. That means you have the ability to verify that it works as promised instead of simply trusting vendor promises. But if you don’t make use of this transparency and instead simply download something because it’s free, you are setting yourself up for problems. Is your organization found in the more than half-empty part of the glass?

You Need an Agile End Result More Than an Agile Process

An agile development process is not important. An agile end result is. If your organization realizes a benefit from an agile methodology, that only helps you during the relatively short development process. But if you build something that can easily be re-configured and changed, that benefits you for the years or decades that you are running the system.

You would think that a digital billboard would be agile. The whole point is that it can display whatever you want. But German advertisers and shops have just realized that their display screens have very narrow agility. A new law requires these energy-guzzling billboards to be switched off at night to save electricity. It turns out that all the devices were built on the assumption that they would always be on, and they do not take kindly to store employees simply yanking the power cord when they go home.

To achieve agility in your products and systems, you need to avoid hard-wiring your assumptions into them. The only thing you can safely assume is that everything will eventually have to be changed.
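To make that concrete for the billboard case: instead of hard-wiring "always on," the operating hours could be a piece of configuration. This is purely an illustrative sketch; `DisplaySchedule` and its fields are hypothetical names, and I do not know how any actual billboard firmware is built.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class DisplaySchedule:
    """Operating hours as configuration rather than a built-in assumption."""
    on_at: time
    off_at: time

    def should_be_on(self, now: time) -> bool:
        if self.on_at == self.off_at:
            # Identical times are treated as "always on" (the old behavior).
            return True
        if self.on_at < self.off_at:
            # Normal daytime schedule, e.g. 06:00 to 22:00.
            return self.on_at <= now < self.off_at
        # Schedule that wraps around midnight, e.g. 22:00 to 06:00.
        return now >= self.on_at or now < self.off_at
```

With something like this in place, a new law about switching off at night becomes a configuration change, and the device can shut itself down cleanly instead of having its power cord yanked.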

What Do You Incentivize?

You get what you reward. Twitter decided to primarily reward user growth, and things went downhill from there. Recently, their head of security quit. Then he filed a whistleblower complaint with the authorities, complaining that Twitter’s security is bad and not getting better. Now there is likely to be a very interesting congressional hearing.

White-hat hacker Peiter Zatko (aka “Mudge”) was hired after a 2020 security breach but could not implement the changes he felt were necessary to fight spam and automated bots. The reason is the incentive structure at Twitter.

You see, Twitter management bonuses are based on user growth. There is no bonus for reducing spam or automated bots. You get what you reward, with no exception. You can reward with money, perks, promotions, or other recognition, but you have to incentivize the behavior you want. If all your incentives are based on quantity and not quality, you will get ever-increasing quantity and ever-decreasing quality.

Do You Want Fancy, Cheap, or Usable?

Do you prefer fancy, cheap, or useful? Car manufacturers have found that touchscreens look fancy and are cheaper than physical buttons. That’s why modern cars have touchscreens and no buttons. I prefer buttons, and science is on my side. A survey shows that it is almost twice as fast to perform common car tasks with buttons as with a touchscreen.

Unfortunately, a car is not required to be easy to operate. Aircraft cockpits, on the other hand, are full of physical buttons. An aircraft manufacturer accepts the extra cost of physical controls to provide an optimal user experience for the pilot.

Marketing wants fancy. Finance wants cheap. User Experience wants usability. Who has the last word on product design in your organization?

The Tolstoy Principle in Action

This is what failure looks like: 50% one-star reviews. The other half is five-star reviews. Assuming these are not all from the app developers themselves, the app apparently can work. It just didn’t work for me, nor for many others.

I call this the Tolstoy principle: All successful apps are alike; each unsuccessful app is unsuccessful in its own way. The end-user does not care that 98% of your back-end infrastructure is running. They care that they can complete their task. And if one critical component fails, your app is a failure. Like this one from my local supermarket chain.

When you build systems, is all the attention lavished on a cool front-end app? Unsexy back-end services are equally important.