Do You Need People to Run Your Systems?

If everybody in IT left, would your software systems still run? Of course they would. Any professional IT organization strives for hands-off, lights-out operation.

In the short term, a running system should not need any human intervention. It should automatically allocate more disk space and apply routine vendor patches. If you have a variable workload, your system should auto-scale or auto-throttle. User provisioning should be automated, as should routine password resets. System privileges should automatically follow the organizational role of an individual.
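As a minimal sketch of what "privileges follow the role" could look like, the Python below derives a person's access from their organizational roles and works out what to grant and revoke when those roles change. The role names, privilege strings, and the functions privileges_for and reconcile are all hypothetical, invented for illustration; a real setup would pull roles from the HR or directory system.

```python
# Hypothetical mapping from organizational role to system privileges.
# All role and privilege names here are invented for illustration.
ROLE_PRIVILEGES = {
    "accountant": {"ledger:read", "ledger:write"},
    "auditor": {"ledger:read"},
    "developer": {"repo:read", "repo:write"},
}


def privileges_for(roles):
    """Return the union of privileges granted by the given roles."""
    granted = set()
    for role in roles:
        # Unknown roles grant nothing rather than raising an error.
        granted |= ROLE_PRIVILEGES.get(role, set())
    return granted


def reconcile(current, roles):
    """Compute what to grant and revoke so access matches the roles."""
    target = privileges_for(roles)
    return {"grant": target - current, "revoke": current - target}


# A developer who moves to an auditing role loses repository access
# and gains read access to the ledger, with no ticket to IT required.
changes = reconcile({"repo:read", "repo:write"}, ["auditor"])
```

The point of the design is that nobody edits an individual's privileges directly: the role assignment is the only input, so access cannot silently drift away from the org chart.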

In the medium term, however, an unattended system will collapse. There will be emergency security patches that need manual attention. There will be changes in APIs you depend upon.

It remains to be seen if Elon Musk has retained enough talent to stave off the medium-term collapse of Twitter. How about you? Do you have the talent you need to maintain all your systems? Or are some of them left totally unattended, waiting for an implosion?

What Sicily Can Teach You About IT Architecture

I’m back from Sicily, which was Greek, Roman, Arab, Norman, French, German, and Spanish before becoming Italian. Each civilization reused whatever was bequeathed to it by its predecessors. That’s why you find a Baroque church incorporating columns taken from a ruined Greek theater and a palazzo built partially from lava stones taken from a Roman villa.

I remembered a quote from Ellen Ullman: “We build our computer systems the way we build our cities: over time, without a plan, on top of ruins.”

We cannot see through the layers beneath our buildings in the physical world. That makes it hard to figure out if you can safely build higher. In IT, we can see the foundation. That allows us to know what changes we can make. As long as we have the source code and the people to understand it…

Do You Know Where Your Data Is?

Facebook has no idea where it stores your data. In a hearing, two senior Facebook employees admitted that they couldn’t say where user data was stored, much less ensure that it was all turned over to the authorities or deleted if required. The investigator said, “Surely someone must have a diagram?” The engineers replied, “No, the code is its own documentation.”

The second law of thermodynamics applies to IT systems just like it applies to the rest of the world. It says that the amount of entropy, or disorder, inexorably increases unless someone spends energy actively trying to diminish it.

That becomes a problem when nobody spends time refactoring or cleaning up, but everybody spends lots of time adding new features, integrations, and dependencies. More than half of all organizations are where Facebook is: They don’t have and cannot establish the full picture of how their systems work. That places them at risk of catastrophic and irrecoverable failure. Can you establish a complete overview of your systems?

Optimization to Powerlessness

Here in Denmark, we were surprised to find that the Russians have rendered our military combat-ineffective. When NATO asks what we can provide, we can offer a hundred special forces soldiers, some past-due-date antitank weapons, and an armored brigade without armor. The reason is not lack of money. We spend many millions. We just don’t spend them on things that matter.

The Russians did not have to attack us kinetically or subject us to a devastating cyber-attack to achieve this. They simply needed to infiltrate the Ministry of Defence with spreadsheet-wielding MBAs supported by a fifth column from McKinsey. We have now optimized our way to warfighting impotence.

Many organizations have similarly found that they have optimized themselves to powerlessness. A ship stuck in the Suez or a war in Ukraine will bring their entire production to a halt.

The only way to resilience, as any capable army knows, is to have extra. You have more supplies on hand than the absolute minimum, and more different suppliers than you need. You have spare warehouses and production capacity. If you let the MBAs with their spreadsheets run the business, you might suddenly find you have no business.

Think About the End at the Beginning

Your risk of getting hit by space debris just went up. The Chinese have launched the first module of their space station. Like last time, they have left their launch booster in an uncontrolled orbit. Other nations plan a controlled deorbit so they can splash their used rockets in the sea. Private companies reuse them. The Chinese just let theirs hit wherever.

All objects have a lifecycle. In modern production, manufacturers are starting to think about how to ensure that as much of a product as possible can be reused, recycled, or disposed of safely. In IT, we’re not good at thinking about end-of-life. That’s why we have decades-old mainframe systems that we can’t figure out how to get rid of.

As a CIO or CTO, next time you greenlight a new system, ask the architects and designers how they plan to decommission it. How will useful data be extracted from the system? Will historical data need to be saved? How will the business logic be extracted and reused many years into the future? The system works to spec now, but in less than a month, the system and the documentation will have diverged. Think about the end at the beginning. Don’t be like China and leave it to chance.