Don’t Ask Half Questions

Asking half questions leads to dangerous outcomes. We just saw an example when irresponsible Reuters pollsters looking for a scoop simply asked Americans, “Should NATO establish a no-fly zone over Ukraine?” They got a resounding 74% approval.

Another pollster asked the question with the qualifier “knowing that this will lead to direct war with Russia” and support dropped to 34%.

A complete question asks “are you willing to accept this downside to gain this upside?” Organizations get an idea, focus on the upside, take a cursory glance at the downside, and then take erroneous or even disastrous decisions. Who has the job of ensuring the downside is examined as well as the upside? You might need someone external to provide this.

There is Always an Alternative

There is always an alternative. Not looking for it is either intellectual laziness or willful manipulation. Margaret Thatcher, Prime Minister of the UK for over a decade, was known among friends and enemies alike as “TINA” due to her usual insistence that “There Is No Alternative.”

As an IT leader, you are bombarded with requests to make specific technical decisions. Many of these are attempts to railroad you into choosing a technology that the team would like to play with and put on their CVs. When presented with a single option, ask for more. When one of the options is the obvious slam dunk, examine what has been left out of the presentation of the others. Binary selections are common in computer programming. In the real world, there are always many choices.

Are you Monitoring Important Systems?

New York is replacing its payphones with LinkNYC access points providing free calls, 911 calls, free WiFi, charging, and more. You would think such a system would warrant professional monitoring. Nevertheless, some of these devices just show a blue screen of error messages followed by a Linux login prompt.

  • Monitoring of crucial systems must include an automated mitigation action and reporting to a 24/7 operations center.
  • Monitoring of important systems needs immediate alerting to staff on call.
  • Monitoring of normal systems only needs to log a trouble ticket to be addressed by regular staff during working hours.
  • Low-priority systems do not need active monitoring.

It seems these kiosks are not as important to the company running the system as they were to the Mayor promising them.

Does every system on your central system list have a monitoring priority? When was the last time you asked the person with technical responsibility for each system what monitoring is actually in place?
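One way to make the first question answerable is to keep the monitoring priority as a field in the system list and check it automatically. The sketch below is only an illustration: the tier names follow the four levels above, but the inventory layout, the system names, and the two helper functions are assumptions made up for this example.

```python
from enum import Enum


class Priority(Enum):
    """Monitoring tiers, mirroring the four levels listed above."""
    CRUCIAL = "automated mitigation and report to the 24/7 operations center"
    IMPORTANT = "immediate alert to on-call staff"
    NORMAL = "trouble ticket for regular staff during working hours"
    LOW = "no active monitoring"


# Hypothetical system list; None means nobody has assigned a priority yet.
inventory = {
    "payment-gateway": Priority.CRUCIAL,
    "public-kiosk": None,
    "intranet-wiki": Priority.NORMAL,
}


def find_unclassified(systems):
    """Return the names of systems that have no monitoring priority."""
    return sorted(name for name, prio in systems.items() if prio is None)


def required_response(systems, name):
    """Return the response the system's tier calls for when it fails."""
    prio = systems[name]
    if prio is None:
        raise ValueError(f"{name} has no monitoring priority assigned")
    return prio.value
```

Calling `find_unclassified(inventory)` on this made-up list returns the kiosk, which is exactly the kind of gap the question is meant to surface.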

Fancy or Usable?

Do you want something that works or something that looks fancy? Sometimes, these two objectives come into conflict. Too often, IT professionals can’t imagine a solution that does not involve touchscreens and mobile apps.

I’m staying in an upscale hotel in New York this week, and the control panel for heating and lighting is definitely old-school. But it works. And it can be understood and operated by every age group likely to frequent the hotel.

Meanwhile, back in Denmark, we are currently rolling out a new central authentication system. You will have to figure it out in order to do online banking or access public services. It was designed by tech-savvy young people and is very fancy. Too bad it has left hundreds of thousands of non-computer-literate citizens desperately calling the understaffed phone helpline.

Are you sure the solutions you roll out have been tested by the entire target audience?

Check your defenses

Your risk profile just changed dramatically. You might think the war in Ukraine will not affect you, but your risk is higher than you think.

Do you know who ultimately writes the code your vendor delivers? Your contract is with a large system integrator in your own country. They outsource actual coding to several subcontractors, who sub-subcontract until the actual code is written by a team of three people in a basement in Kyiv. And right now, an adversary with nation-state resources is out to destroy the Ukrainian software industry along with the rest of the country.

Remember the attack that hit Maersk Line a few years ago? They are the world’s largest container shipping company and have strong cyber defenses. Nevertheless, they suffered a two-week outage and lost $300 million because an attack on their Ukrainian subsidiary got through their defenses.

Revisit your risk management plan. You need stronger network security toward all your suppliers.

Shooting the messenger

Even though the clueless Governor of Missouri tried to shoot the messenger, he missed. Last year, a reporter published his findings that private data on more than 100,000 teachers was available to anyone who knew how to click “View Source” on a web page. The Governor held a widely-ridiculed press conference where he vowed to prosecute the “hackers” who had told the world about the incompetence of the state IT department.

A thorough report by law enforcement now roundly exonerates the journalist. It also exposes that personal information on more than half a million people had been available for a decade to anyone who cared to look.

Even professional IT organizations occasionally fail like the state of Missouri did here. You have a simple little system, you are under schedule pressure, and you forget to book time with the security team. So you roll it out without a security review. The antidote to this is to maintain a complete systems inventory with a field for the name and email of the person who did the security review. That will show you if this step got skipped, and allow you to quickly ask questions about any alleged security issues before you start shooting at the messenger.
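The inventory check can be automated in a few lines. What follows is a minimal sketch under stated assumptions: the record layout, the field names, and the sample systems are hypothetical, and a real inventory would more likely live in a CMDB or spreadsheet than in code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SystemRecord:
    """One row in a hypothetical systems inventory."""
    name: str
    owner_email: str
    reviewer_name: Optional[str] = None   # who did the security review
    reviewer_email: Optional[str] = None  # how to reach them quickly


def missing_security_review(records):
    """Return the names of systems whose reviewer name or email is blank."""
    return [
        r.name
        for r in records
        if not (r.reviewer_name and r.reviewer_name.strip())
        or not (r.reviewer_email and r.reviewer_email.strip())
    ]


# Made-up sample data for illustration only.
records = [
    SystemRecord("payroll", "owner@example.org", "A. Reviewer", "sec@example.org"),
    SystemRecord("teacher-lookup", "owner@example.org"),
]
```

Here `missing_security_review(records)` returns the one system that went live without a named reviewer. When someone reports a problem, the same record tells you whom to call first.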

User Blaming

The IT industry has its own version of victim blaming. I call it user blaming. That is what happens when you build an IT system without proper regard for the users’ reality. When the purported benefits do not materialize, the vendor points to the convoluted and impractical instructions given and claims that if only the users would follow the instructions, the system would work as advertised.

I was reminded of user blaming this weekend. I had worn out the burrs on my coffee grinder, and as is sadly often the case, a replacement part was more expensive than a new machine, so I bought a new one. Being a professional, I always read the instructions. They told me to clean the machine after each use. Since I only grind what I need, that would mean several cleanings a day. And the cleaning involved six steps: emptying out the beans, disassembling the grinder, cleaning the burrs with the supplied cleaning brush, washing everything in lukewarm water, and much more.

That is an abdication of responsibility. Just like when an IT vendor provides unrealistic and impossible-to-follow CYA instructions. Take responsibility. Build a quality product that works in real life.

We are Still Building our IT on Shaky Ground

Once again, researchers have demonstrated the shaky foundation under our IT infrastructure. A lot of modern code is being built with Node.js, making use of Node Package Manager (npm) to pull in libraries that your code depends on.

There is no evidence that evil Russian hackers built npm. But if it didn’t exist, it would be a priority for the cyber-warfare command of our adversaries to build something like it and tempt us to use it.

The problem is that it is easy to use npm wrong, and hard to use it right. We’ve already seen many cases where organizations simply pull the latest packages from npm when they build. That means that as soon as a central package in the npm repository is corrupted or taken down, that failure will ripple through many layers of code that depend on it.

The latest discovery is that 8,000 packages have maintainers with an expired email domain. That allows any hacker to purchase that domain, re-create the maintainer email and take over the package.

Ask your CISO or other security function how your IT organization makes sure that every project only pulls npm packages from your official repository of security-vetted packages.
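One concrete check your security function could run is to scan each project's lockfile for packages fetched from somewhere other than the vetted repository. The sketch below assumes the lockfile layout used by npm 7 and later, where a top-level "packages" object holds a "resolved" URL per entry. The registry URL, the package name `vetted-lib`, and the function name are hypothetical.

```python
import json

# Hypothetical internal registry of security-vetted packages.
OFFICIAL_REGISTRY = "https://npm.internal.example/"


def packages_outside_registry(lockfile_text, registry=OFFICIAL_REGISTRY):
    """List lockfile entries whose "resolved" URL is not under `registry`."""
    lock = json.loads(lockfile_text)
    offenders = []
    for path, entry in lock.get("packages", {}).items():
        resolved = entry.get("resolved")
        # The root project (the "" key) has no "resolved" field; skip it.
        if resolved and not resolved.startswith(registry):
            offenders.append((path, resolved))
    return sorted(offenders)


# Made-up lockfile content for illustration.
sample = json.dumps({
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "demo", "version": "1.0.0"},
        "node_modules/left-pad": {
            "version": "1.3.0",
            "resolved": "https://registry.npmjs.org/left-pad/-/left-pad-1.3.0.tgz",
        },
        "node_modules/vetted-lib": {
            "version": "2.0.0",
            "resolved": "https://npm.internal.example/vetted-lib/-/vetted-lib-2.0.0.tgz",
        },
    },
})
```

Running `packages_outside_registry(sample)` flags only the entry pulled from the public registry. A production check would also need a policy for git dependencies and local links, which this sketch simply reports as offenders.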

Reliability Engineering

Coinbase just spent 16 million dollars on a 30-second Super Bowl ad. It seems like the ad worked, because their website was promptly overwhelmed with traffic and crashed. Maybe they should have spent a bit on resilient network infrastructure as well.

The problem with many of the IT infrastructures I see is that they are brittle. Each component can be resilient with load balancing and database failover without making the overall system robust. Reliability engineering is a cross-domain discipline, and it is not enough that each team builds robustness into their little piece of the total landscape. Who is responsible for the overall reliability of your systems?
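A little arithmetic shows why resilient pieces do not add up to a robust whole. If a request has to pass through several components in a row, the availabilities multiply. The sketch below assumes independent failures and made-up availability figures, so treat it as an illustration rather than a model of any real system.

```python
from math import prod


def serial_availability(component_availabilities):
    """Availability of a chain in which every component must be up.

    Assumes component failures are independent of each other.
    """
    return prod(component_availabilities)


def downtime_hours_per_year(availability, hours_per_year=8760):
    """Translate an availability fraction into expected hours of downtime."""
    return (1 - availability) * hours_per_year


# Five hypothetical components, each individually "three nines".
chain = [0.999] * 5
overall = serial_availability(chain)
```

With these assumed numbers, each component is down under nine hours a year, yet the chain as a whole is available only about 99.5% of the time, which is well over forty hours of downtime. No single team sees that number unless someone owns the whole chain.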

The Horror of not Testing

In the classic 1983 John Carpenter horror movie “Christine,” the radio on the possessed 1958 Plymouth Fury can only play old rock and roll stations. Owners of 2016 Mazdas in Washington State now have the same experience. They don’t even get rock’n’roll but are instead forced to endure NPR.

Their cars are not possessed by evil spirits but suffer from a software bug. It turns out that the local NPR station sent out “now playing” album images without a .jpg extension. That was enough to send the radio and navigation unit into an endless loop, making it impossible to use navigation or Bluetooth – or change the station. Embarrassed, Mazda is offering a free replacement of the $1,500 connectivity master unit.

This incident illustrates the dangers of casual testing. A professional tester would have sent the unit all kinds of corrupted or misnamed files, files with zero length, and very large files. That would have uncovered the bug. Do you have testing professionals on your teams? If you let developers test their own software, you’ll end up where Mazda is – or worse.
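The kinds of input a professional tester would try can be generated mechanically. The sketch below builds a handful of hostile files and feeds them to a handler. `naive_handler` is a hypothetical stand-in for the unit under test, not Mazda's code, and the byte strings are not real images, only data that starts with the JPEG start-of-image marker.

```python
import os
import tempfile

# Not a real image: just bytes beginning with the JPEG start-of-image marker.
JPEG_BYTES = b"\xff\xd8\xff\xe0" + b"\x00" * 64 + b"\xff\xd9"


def hostile_cases(large_size=5 * 1024 * 1024):
    """Return (filename, content) pairs that casual testing rarely covers."""
    return [
        ("cover", JPEG_BYTES),                  # no extension at all
        ("cover.jpg", b""),                     # zero-length file
        ("cover.jpg", JPEG_BYTES[:5]),          # truncated file
        ("cover.jpg", b"not an image at all"),  # content does not match name
        ("cover.png", JPEG_BYTES),              # misnamed file
        ("cover.jpg", b"\xff\xd8" + b"\x00" * large_size),  # very large file
    ]


def run_cases(handler, cases):
    """Call handler(path) for each case and collect the cases that raise."""
    failures = []
    with tempfile.TemporaryDirectory() as tmp:
        for index, (name, content) in enumerate(cases):
            case_dir = os.path.join(tmp, str(index))
            os.mkdir(case_dir)
            path = os.path.join(case_dir, name)
            with open(path, "wb") as fh:
                fh.write(content)
            try:
                handler(path)
            except Exception as exc:
                failures.append((index, name, type(exc).__name__))
    return failures


def naive_handler(path):
    """Hypothetical unit under test: it trusts the file extension."""
    extension = os.path.splitext(path)[1]
    if extension != ".jpg":
        raise ValueError(f"unexpected extension {extension!r}")
```

Running `run_cases(naive_handler, hostile_cases())` reports the extensionless and misnamed files as failures. This harness only catches exceptions; an endless loop like the one in the cars would also need a timeout around each call.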