Backup Communication Channels

What is the difference between 30 individual soldiers and a platoon? Leadership and the ability to communicate.

The first step in your resilience planning is to ensure that you can still communicate, even when faced with an onslaught of Russian hackers or American government officials.

That could mean an on-premises open source mail server and a basic web server. Every workstation and company smartphone could have a separate open source mail client and web browser preconfigured for those servers.

There are many other options – the paranoid and those with high threat levels might have spare phones running GrapheneOS and Briar, or even establish their own Meshtastic mesh network.

If you don’t have a backup communication channel, you urgently need to establish one. Especially if you are outside the U.S. and depend on U.S. services.

Rational Decision-making

As an individual, you are free to make emotional decisions. You can decide to evict some software product from your laptop because you don’t like the vendor’s nationality or stance on today’s hot-button social issue. As an IT professional, you can even set up an open source solution that does almost the same thing (though invariably with worse UX) in a few days.

But as an IT leader, you are expected to make rational decisions. That’s why you don’t throw out all your Amazon, Microsoft, or Google cloud services on a whim because you are unhappy with U.S. policy. The rational choice is to minimize your risk. That means building new systems outside U.S. clouds so you don’t add to your problems, and migrating away from disfavoured platforms in an orderly, cost-effective manner.

Sovereign Cloud

You need to create a delay between a foreign government ordering your cloud provider to cut you off and the actual cutoff. The longer you can make this period, the better. The new AWS European Sovereign Cloud (ESC) is Amazon’s way of offering this. It is a cloud solution running on hardware in Europe, staffed by Europeans, and organized under a European subsidiary.

That does not protect you against Amazon being compelled by the U.S. government to hand over your data; all important data should be protected with keys you hold, outside your cloud provider, anyway. It does, however, make it likely that AWS Europe would contest an order to shut down the service, and that AWS Europe employees would not cut you off at the whim of a foreign dictator.

Because the probability of this happening is still “Rare” (edging towards “Unlikely”), you do not need to act on this risk now. But it is prudent to ensure you have time to react if it should happen.

Risk Evaluation

“Do you have a paper map in your car?” “No, why would I need that?”

If you are a Verizon customer in the U.S., you were just reminded why. A large chunk of their network was down for half a day, leaving frustrated customers to rely on their atrophied geographical memory. Verizon says the culprit was the usual botched network upgrade, not evil hackers. Some Europeans are better prepared, having routinely been subjected to Russian jamming of GPS and Galileo navigation.

When was the last time you revisited the risk evaluation of your critical systems? The threats are changing and increasing, and your risk evaluation from one year ago no longer applies.

Downside Thinking

What is the downside? That is the critical question for any kind of decision. Someone has an idea, and they will present the upside. We can save so much money, develop faster, offer better customer service, etc., etc. It is your job as a leader and decision-maker to ferret out the downside.

Lack of downside consideration is behind many questionable business decisions. Twitter seems to offer a new example of Elon’s lack of downside thinking every week. Like “Let’s offer a feature that changes any photo to show the person in a bikini.” Normal people, doing even the most minimal downside consideration, would kill that idea in seconds. But Twitter/X rolled it out – and obviously had to roll it back.

You can do the downside thinking yourself for many decisions. For more complicated scenarios, you might need a dedicated Red Team or outside help to identify the downside. But before any decision, ask yourself: Have we considered the downside?

Digital Sovereignty

You need to think about Digital Sovereignty. Unless you are in the U.S., of course. For everybody else, this is a very salient topic. Especially for us in Denmark these days.

This doesn’t mean that you have to free yourself from every American cloud provider. But it does mean there is a new item in your risk evaluation: Ending up on the Office of Foreign Assets Control (OFAC) blocklist.

Likelihood is Rare (1) for almost everybody. But if Impact is Catastrophic (5), you end up with a medium risk: Mitigate if cost-effective.
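
The scoring above can be sketched as a simple likelihood-times-impact matrix. This is a minimal illustration, not a standard: only “Rare”, “Unlikely”, “Catastrophic”, and the medium-risk action come from the text, while the other labels, the thresholds, and the function names are assumptions you should replace with your own risk policy.

```python
# Hypothetical 5x5 risk matrix. Labels other than "Rare", "Unlikely" and
# "Catastrophic", and all thresholds below, are assumptions for illustration.
LIKELIHOOD = {"Rare": 1, "Unlikely": 2, "Possible": 3, "Likely": 4, "Almost certain": 5}
IMPACT = {"Insignificant": 1, "Minor": 2, "Moderate": 3, "Major": 4, "Catastrophic": 5}

# Assumed actions per level; only the "Medium" wording comes from the text.
ACTIONS = {
    "Low": "Accept",
    "Medium": "Mitigate if cost-effective",
    "High": "Mitigate",
    "Extreme": "Mitigate immediately",
}


def risk_score(likelihood: str, impact: str) -> int:
    """Multiply the numeric likelihood by the numeric impact."""
    return LIKELIHOOD[likelihood] * IMPACT[impact]


def risk_level(score: int) -> str:
    """Map a score from 1 to 25 onto a level using assumed thresholds."""
    if score <= 4:
        return "Low"
    if score <= 9:
        return "Medium"
    if score <= 16:
        return "High"
    return "Extreme"


def recommended_action(level: str) -> str:
    """Look up the assumed action for a risk level."""
    return ACTIONS[level]


score = risk_score("Rare", "Catastrophic")
level = risk_level(score)
print(score, level, recommended_action(level))
```

With these assumed thresholds, moving the likelihood from “Rare” to “Unlikely” raises the score from 5 to 10 and pushes the risk from medium to high, which is why the likelihood estimate deserves regular review.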

Switching costs almost always make it not cost-effective to transition a running system. But when you are building anything new, you don’t have switching costs. And an effective mitigation is to avoid using U.S. providers.

Rollback Plans

What differentiates a professional software organization from a bunch of amateurs? One thing is the ability to roll back.

It’s not a good day at the office when the Federal Aviation Administration issues an Emergency Airworthiness Directive, grounding 6,000 of the aircraft your customers are flying. A JetBlue flight on autopilot suddenly turned nose-down in mid-flight, and it turns out that the L104 version of the ELAC software was vulnerable to memory corruption due to solar radiation.

But aircraft manufacturers and airlines have procedures in place, and engineers rapidly fanned out to airports around the world, rolling back to the L103 version.

It is impossible to test every situation, and every once in a while, something unforeseen happens. Professional software organizations can rapidly recover. Do you have rollback plans in place every time you roll out new versions of critical software?
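
In software terms, a rollback plan means at least keeping the previous version available and having a tested way back to it. The sketch below is one hedged illustration of that idea; the `Deployment` class, the version labels, and the `health_check` callback are all hypothetical and do not describe any particular deployment tool.

```python
class Deployment:
    """Hypothetical deployment record that always remembers the previous version."""

    def __init__(self, current: str):
        self.current = current
        self.previous = None  # no rollback target until the first rollout

    def roll_out(self, new_version: str, health_check) -> bool:
        """Switch to new_version; revert automatically if the health check fails."""
        self.previous, self.current = self.current, new_version
        if not health_check(new_version):
            self.roll_back()
            return False
        return True

    def roll_back(self) -> None:
        """Return to the previous version, or fail loudly if there is none."""
        if self.previous is None:
            raise RuntimeError("no previous version to roll back to")
        self.current, self.previous = self.previous, None


deployment = Deployment("v1")
# Assumed health check that always fails, to demonstrate the rollback path.
succeeded = deployment.roll_out("v2", health_check=lambda version: False)
print(succeeded, deployment.current)
```

The point of the sketch is the invariant, not the mechanism: a rollout is only allowed to proceed in a way that leaves a known-good version to return to.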

What Are the Essential Dependencies of Your Critical Systems?

You have two kinds of systems: Those where you can wait for Cloudflare, Amazon, or Microsoft to come back up, and those where you can’t.

For critical systems, your developers and operations people must carefully examine the code and document all dependencies. Once you know what you depend on, you can decide whether to build in automatic mitigation or establish a limited-functionality mode.

To concentrate your efforts in the right place, your systems list must identify the truly critical systems and their dependencies. Does it?
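
A systems list like that can start as something very small. The following is a minimal sketch, assuming a hypothetical inventory in which each system is marked as critical or not and each dependency records its mitigation, if any. The system names and the data structure are invented for illustration; only the provider names come from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class Dependency:
    name: str
    # Empty string means no mitigation has been decided yet.
    mitigation: str = ""


@dataclass
class System:
    name: str
    critical: bool
    dependencies: list = field(default_factory=list)


def unmitigated(systems):
    """Return (system, dependency) name pairs for critical systems without a mitigation."""
    return [
        (system.name, dep.name)
        for system in systems
        if system.critical
        for dep in system.dependencies
        if not dep.mitigation
    ]


# Hypothetical inventory for illustration only.
systems = [
    System("order-intake", critical=True, dependencies=[
        Dependency("Cloudflare"),
        Dependency("Amazon", mitigation="limited-functionality mode"),
    ]),
    System("intranet-wiki", critical=False, dependencies=[
        Dependency("Microsoft"),
    ]),
]

gaps = unmitigated(systems)
print(gaps)
```

Here the only gap reported is the critical system’s dependency on Cloudflare; the non-critical wiki is deliberately ignored because it can wait for the provider to come back up.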

Avoiding Project Failure, the Frank Gehry Way

Projects by famous architect Frank Gehry are always completed on time and on budget. That’s not because he only does small and easy things – for example, he designed the Guggenheim Museum Bilbao.

But what he does do is prepare carefully. It might take several years for Mr. Gehry to plan, build scale models, and solve engineering challenges. That all happens cheaply before the construction team moves in with thousands of people and heavy machinery. Sometimes, this preparation means a project is not done. That’s because Gehry will discover in advance that the project as envisioned cannot be completed with the time and budget available.

We’ve wasted $10 million of taxpayer money a year for several years in a row here in Denmark because nobody here works like Frank Gehry. The politicians decided to allocate money for “AI signature projects,” and nothing came of them in 2020. So, they allocated another $10 million in 2021. Same result. In 2022, another $10 million was wasted.

The money would not have been wasted if these projects generated new knowledge. But they didn’t. They spent money on data scientists and programmers only to discover afterward either that they did not have the data they needed to train their AIs or that their use of AI violated existing legislation and citizens’ rights.

That could have been discovered cheaply before the programmers started coding. But everybody wanted to run the project. When you are considering a project in your organization, especially in a fashionable technology like AI, you need an independent outsider to review your business case. That’s one of the things I do for my customers. Get in touch to hear more.

You Don’t Want a Sam Altman

You don’t want a Sam Altman in your organization. If you have one, you’re not running an IT organization. You are just administering a cult.

I’m all for having brilliant and charismatic performers in the organization. However, having individuals perceived internally and externally as indispensable is not good. Mr. Altman admitted as much back in June when he said, “No one person should be trusted here. The board can fire me, I think that’s important.”

It turns out that the board couldn’t fire him. He had carefully maneuvered himself into a position where investors and almost everyone on the team believed that OpenAI would come crashing down around their ears if he left, costing them the billions of dollars of profit and stock options they were looking forward to.

Make a short list of your organization’s 3-5 star performers. For each of them, ask yourself what would happen if they were let go or decided to leave. If any of them are in a Sam Altman-like position, you have a serious risk to mitigate.