Believe the user, not the vendor

If the users say the system doesn’t work and the project sponsor says it does, believe the users. IT history is full of stories of malfunctioning systems being covered up. The most egregious case is the one where more than 900 British subpostmasters were wrongly convicted of theft, fraud, and false accounting because the Post Office’s fancy new IT system didn’t work. Look up “Horizon IT scandal” for that sad story.

Those with careers and positions to save will go to extraordinary lengths to deny any problems. The people who told the truth about the Vietnam War were the draftees who did not have a military career to protect.

What is your process for monitoring issues with the software your business is running? Do not rely on the number of tickets raised with the service desk. There is unavoidable friction involved in raising a ticket because the IT people will want screenshots and exact software versions. The average user has no clue which version of the internet browser he is using and has more important things to do. If you don’t have a simple system like the four-smiley button panels in shops and airports, you do not know if your software works for the users.

Investing and Throwing Money

There are three ways to spend money on new technology: two good and one bad.

  • Trying it involves spending a small amount of money and time to determine if it has reached a maturity that can be useful in the organization.
  • Investing in it involves preparing a business case outlining expected business benefits and then spending a lot of money implementing it at scale in the organization.
  • Throwing Money at it is just like investing but without the business case.

I am always amazed when I see CIOs declaring that they are investing in some fancy new technology (AI these days) but failing to articulate any specific business goals when asked. That’s not investing; that’s throwing money.

A Teachable Moment

We remember stories. And the massive Windows outage caused by CrowdStrike is a good story.

If you work in Delta Air Lines IT, you won’t forget this story anytime soon. As millions of passengers are stranded and separated from their luggage, you will probably see your CEO hauled in front of Congress for public shaming.

If you are responsible for some of the roughly 8.5 million Windows computers that CrowdStrike, in their incompetence, managed to bring down, you are also likely to remember.

But if you dodged the bullet this time, the whole debacle will become just another tech story in your news feed and quickly forgotten.

However, there are lessons to be learned about canary deployment, robustness against poisoned data, and undocumented software dependencies. To ensure your organization makes the most of this opportunity, have someone read the CrowdStrike Preliminary Post Incident Review and tell the story at your next department meeting. Have them tell everyone why it happened and why it couldn’t happen to you. Or why it could have happened to you, but for the grace of God.

A continually learning organization needs a way to make knowledge stick in its people’s brains. Storytelling is an excellent way to do that. Always be on the lookout for good stories.

Blocking AI is an Unwinnable Battle

Using AI is not cheating. It is a way to become more productive. You pay your employees because they perform tasks that create value for the organization. So it makes sense to let them use the best tools available to do their jobs.

Just as some schools are trying to prevent students from using AI, some companies are trying to outlaw AI. It won’t work. Research shows that 47% of people who used AI tools experienced increased job satisfaction, and 78% were more productive. You can’t fight such dramatic numbers with a blanket prohibition. If you try, your employees will use AI on their phones or in an incognito browser session while working from home.

By all means create rules about how and where employees can use AI, and explain them thoroughly. But trying to ban AI is futile.

Business Knowledge Beats Technical Skill

Most of the value of an IT developer comes from his knowledge of the business. His knowledge of specific programming languages or tools comes a distant second. With AI-supported development tools like Copilot, this value balance becomes even more skewed towards business skills.

That’s why I’m appalled every time I see yet another company replacing hundreds of skilled IT professionals. I’ll grant you that some organizations have too many people and might need to trim their headcount. But often, organizations are trying to kickstart a digital transformation by replacing old hands with a crowd of bright-eyed young things with the latest buzzwords on their CV.

Getting a new crew with better tools and techniques means you can build software faster. But by getting rid of the experienced people, you lose your ability to build the right software. Moving slowly in the right direction beats running fast in the wrong direction.

You Don’t Want a Sam Altman

You don’t want a Sam Altman in your organization. If you have one, you’re not running an IT organization. You are just administering a cult.

I’m all for having brilliant and charismatic performers in the organization. However, having individuals perceived internally and externally as indispensable is not good. Mr. Altman admitted as much back in June when he said, “No one person should be trusted here. The board can fire me, I think that’s important.”

It turns out that the board couldn’t fire him. He had carefully maneuvered himself into a position where investors and almost everyone on the team believed that OpenAI would come crashing down around their ears if he left, costing them the billions of dollars of profit and stock options they were looking forward to.

Make a short list of your organization’s 3-5 star performers. For each of them, ask yourself what would happen if they were let go or decided to leave. If any of them are in a Sam Altman-like position, you have a serious risk to mitigate.

On-Premise Culture

The boss wants you back in the office. He has a point.

The point is that unless your organization was born fully remote, it is stuck with an on-premise culture. You can try to fight it. But remember what happened the last time a new strategy initiative was launched? Your organizational culture completely dominated the new ideas until you were back to doing things the way you had always done them. That is the meaning of the saying usually attributed to management guru Peter Drucker: “culture eats strategy for breakfast.”

In an on-premise culture, relationships are built through in-person interactions. The exciting projects, the conference trips, and the promotions go to the people seen in the organization. You can argue that’s not fair, but all the leaders in your organization grew up in an on-premise culture.

In an on-premise culture, new ideas germinate from chance encounters. The two Nobel Prize winners in medicine this year met at the copy machine. Both were frustrated that nobody took their ideas about mRNA seriously. They started working together, and their work enabled the coronavirus vaccine.

The fully remote organization is a technologically enabled deviation from how humans have organized themselves for thousands of years. Building the culture that makes such an organization work takes precise and conscious decisions, and those decisions go into its DNA from the founding. You cannot retrofit fully remote onto an on-premise culture.

The ROI on AI Projects is Still Negative

Unless you are Microsoft, your IT solutions are expected to provide a positive return on the investment. You might have heard the reports that Microsoft loses about $20 a month for every GitHub Copilot customer. That’s after the customer pays $10 for the product. If you are a heavy user of Copilot, you might be causing Microsoft a loss of up to $80 every month.

Some organizations are rich enough to be able to afford unprofitable products like this. They typically have to spend their own money. VCs seem to have soured on the idea that “we lose money on every customer, but we make up for it in volume.”

If you are running an AI project right now, you should be clear that it will not pay for itself. Outside a very narrow range of applications, typically image recognition, AI is still experimental. If you have approved an AI project based on a business case showing a positive ROI, question the assumptions behind it. The AI failures are piling up, and even the largest, best-run, and most experienced organizations in the world cannot make money implementing AI yet. You probably can’t, either. Unless you have money to burn, let someone else figure out how to get AI to pay for itself.

The Guard Rail Pattern

There is a simple way to prevent many IT disasters, and it is sadly underused. It’s not on the standard lists of design patterns, but I call it the “Guard Rail” pattern.

It would have prevented the IT disaster that dominates the news cycle in Denmark these days. Techno-optimists have forced a new digital building valuation on the long-suffering Danes, and it is an unmitigated catastrophe. The point is to replace the professional appraisers who determine the value of a property for tax purposes with a computer system. And many of the results from the computer are way off. Implementing a Guard Rail pattern would mean that the output from the new system would be compared to the output from the old one, and those valuations that are, for example, more than three times the old value would be stopped and manually processed.
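To make the idea concrete, here is a minimal sketch of such a Guard Rail in Python. Everything in it is my own illustration, not how the Danish system works: the `Valuation` record, the function names, and the decision to also stop valuations that are far too low are assumptions. The 3x threshold is simply the example from the paragraph above.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical threshold, taken from the "3x" example in the text.
MAX_RATIO = 3.0


@dataclass
class Valuation:
    property_id: str
    old_value: float  # last valuation made by the human appraisers
    new_value: float  # output of the new automated system


def needs_manual_review(v: Valuation, max_ratio: float = MAX_RATIO) -> bool:
    """Return True when the new valuation should be stopped and handled by a person."""
    if v.old_value <= 0 or v.new_value <= 0:
        return True  # nothing sensible to compare against, so do not trust it
    ratio = v.new_value / v.old_value
    # Assumption: a valuation that is far too low is as suspicious as one far too high.
    return ratio >= max_ratio or ratio <= 1 / max_ratio


def split_by_guard_rail(
    valuations: List[Valuation],
) -> Tuple[List[Valuation], List[Valuation]]:
    """Split valuations into (accepted automatically, held for manual processing)."""
    accepted: List[Valuation] = []
    held: List[Valuation] = []
    for v in valuations:
        (held if needs_manual_review(v) else accepted).append(v)
    return accepted, held
```

The guard rail does not need to know why a value is wrong. It only needs an independent yardstick (here, the old valuation) and a rule for when to hand the case to a human.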

A colleague just shared a video of the latest iteration of the Tesla Full Self Driving mode. This version seems to be fully based on Machine Learning. Previous versions used ML to detect objects and traditional algorithmic programming to determine how to drive. As always infatuated with his own cleverness, Elon Musk does not seem to think that guard rails are necessary. Never mind that the FSD Tesla would have run a red light had the driver not stopped it. Implementing the Guard Rail pattern would mean that a completely separate system gets to evaluate the output from the ML driver before it gets passed to the steering, accelerator, and brakes.

When I attach a computer to my (traditional) car to read the log, I can see many “unreasonable value from sensor” warnings. This indicates that traditional car manufacturers are implementing the Guard Rail pattern, doing a reasonableness check on inputs before passing the values to the adaptive cruise control, lane assist, and other systems. But the Boeing 737 MAX 8 flight control software was missing a crucial Guard Rail, allowing the computer to override the pilot and fly two aircraft into the ground.
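A reasonableness check on inputs can be very small. The sketch below is a guess at the general shape, not what any car manufacturer actually ships: the sensor names, the plausible ranges, and the `check_reading` function are all hypothetical.

```python
import math
from typing import List, Optional

# Hypothetical plausible ranges; real limits would come from the sensor specifications.
PLAUSIBLE_RANGES = {
    "speed_kmh": (0.0, 300.0),
    "distance_to_vehicle_m": (0.0, 250.0),
}


def check_reading(name: str, value: float, log: List[str]) -> Optional[float]:
    """Pass a plausible reading through; log and drop an implausible one."""
    low, high = PLAUSIBLE_RANGES[name]
    if math.isnan(value) or not (low <= value <= high):
        log.append(f"unreasonable value from sensor: {name}={value}")
        return None  # downstream systems must treat None as "no trustworthy data"
    return value
```

The important design choice is that the cruise control or lane assist never sees the bad value. It gets either a plausible reading or an explicit "no data" and has to fall back to something safe, such as handing control to the driver.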

In your IT organization, discuss where it makes sense to implement the Guard Rail pattern. Your experienced developers can probably remember several examples where Guard Rails would have saved you from embarrassing failures. There is no need to keep making these mistakes when there is an easy fix.

Would You Notice the Quality of Your AI Dropping?

You know that ChatGPT is getting more politically correct. But did you know that it is also getting dumber? Researchers have repeatedly been asking it to do tasks like generating code to solve math problems. In March 2023, ChatGPT 4 could generate functioning code 50% of the time. By June 2023, that ability had dropped to 10%. If you’re not paying, you are stuck with ChatGPT 3.5. This version managed 20% correct code in March but was down to approximately zero ability in June.

This phenomenon is known to AI researchers as “drift.” It happens when you don’t like the answers the machine gives and take the shortcut of tweaking the parameters instead of expensively re-training your model on a more appropriate data set. Twisting the arm of an AI to generate more socially acceptable answers has been shown to have unpredictable and sometimes negative consequences.

If you are using any AI-based services, do you know what the engine behind the solution is? If you ask, and your vendor is willing to tell you, you will find that most SaaS AI solutions today are running ChatGPT with a thin veneer of fine-tuning. Unless you continually test your AI solution with a suite of standard tests, you will never notice that the quality of your AI solution has gone down the drain because OpenAI engineers are pursuing the goal of not offending anyone.
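A suite of standard tests does not have to be elaborate. Here is a minimal sketch, assuming you can wrap your AI service in a function that takes a prompt and returns text. The names `pass_rate` and `quality_dropped`, the test-case format, and the 10-percentage-point tolerance are my own assumptions, not any vendor's API.

```python
from typing import Callable, List, Tuple

# A test case is a prompt plus a function that decides whether the answer is acceptable.
TestCase = Tuple[str, Callable[[str], bool]]


def pass_rate(ask: Callable[[str], str], cases: List[TestCase]) -> float:
    """Run every case through the AI service and return the share that passed."""
    if not cases:
        raise ValueError("the test suite is empty")
    passed = sum(1 for prompt, is_ok in cases if is_ok(ask(prompt)))
    return passed / len(cases)


def quality_dropped(current: float, baseline: float, tolerance: float = 0.10) -> bool:
    """Return True when the pass rate has fallen more than `tolerance` below the baseline."""
    return current < baseline - tolerance
```

Run the same suite on a schedule, store the pass rate each time, and raise an alarm when `quality_dropped` returns True. The baseline is whatever pass rate you measured when you accepted the solution.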