cns me / blog Cloudy with a Chance of Freefall

Make the Decision at 2pm

The decision that determines your incident response was not made at 4am. It was made, or not made, weeks before.

On the evening of 26 September 1983, Lieutenant Colonel Stanislav Petrov sat down for a night shift at a Soviet nuclear early-warning facility called Serpukhov-15, south of Moscow. The system was new. The satellites were Soviet. The doctrine was clear: if the computers detected an incoming American missile, Petrov was to phone his superiors immediately. The chain of command would decide what happened next.

At 00:15, the alarm sounded.

The screen said: one inbound missile. Then five. The confidence indicator read "highest." The doctrine told Petrov exactly what to do. Escalate. Let the generals decide. The generals who had never had to decide anything like this at midnight, under sirens, with twelve minutes until impact and the weight of mutual assured destruction pressing on every second of delay.

Petrov hesitated. He ran the logic: a genuine American first strike would not arrive as five missiles. It would come as hundreds. The satellites were new and possibly glitching. He had no pre-agreed framework for this specific scenario — only a process that said "escalate," and his own unassisted judgement that something was wrong.

He was right. It was a malfunction. The world did not end that morning.

But I want to be careful about the lesson. Stanislav Petrov's individual genius at midnight saved hundreds of millions of lives. That is not a model for institutional design. It is an indictment of one.

AI Generated: The night Stanislav Petrov relied on instinct where doctrine had failed him

I think about Petrov every time I walk into a Security Operations Centre.

Not because of the Cold War stakes. Because of the decision architecture. Here was a man asked to make the most consequential judgement of his life, at midnight, at the end of a shift, under conditions precisely calibrated to degrade human performance, with no pre-determined framework for what to do. The Soviet doctrine gave him a process — escalate — but no pre-authorised response for the scenario where escalation itself was the catastrophe. The decision that needed to have been made in a Moscow conference room at 2pm had never been made at all. So it fell to one man, alone, in the dark.

In security operations centres in 2026, we do this to our people every day.

Then Like Now

The Roman legions solved this problem two thousand years ago. They called it imperium — the delegated authority to act without consulting Rome. A legate commanding forces on the Rhine did not wait four hundred miles for Caesar's instruction before responding to a threat. The authority to make certain decisions was granted in advance, in Rome, when the Senate was assembled and the maps were on the table and the stakes could be examined without enemy cavalry on the horizon. Trigger conditions were pre-agreed. Actions were pre-authorised. What remained for the commander in the field was execution, not deliberation.

Dwight Eisenhower understood this instinctively. Before the landings in Normandy, he pre-delegated authority across the command structure so that field commanders could act without waiting for Supreme Headquarters. He wrote the order accepting personal responsibility for the invasion before the boats left harbour — and drafted a second message, accepting failure, folded into his wallet and never sent. The decision about what to do when things went wrong had been made before the guns fired. Not because Eisenhower was a fatalist. Because he understood that no human being makes their best decisions at 4am in the English Channel with anti-aircraft fire overhead.

Then like now.

The Security Operations Centre in 2026 is not Omaha Beach. But the cognitive conditions are less different than we would like to believe. The average SOC team receives roughly 4,484 alerts in a single working day. Sixty-three percent of those alerts are false positives, noise that must be processed before the signal can be found. Seventy-one percent of analysts report burnout at clinical levels. They rotate onto nights. They cover weekends. And then, because someone somewhere decided that threat actors observe office hours, we hand the containment decision to an analyst on their fourth consecutive night shift.

The research on shift work and cognitive performance is not ambiguous. Night shift error rates run 28% higher than day shifts. On a fourth consecutive night, the figure reaches 36%. These are not marginal differences. They are the difference between catching lateral movement in your estate during its first hour and discovering the breach three weeks later when the exfiltration is already complete.

And yet. In most organisations I encounter, the containment decision — whether to pull a compromised host off the network, trigger an account lock, or isolate a segment — still requires a phone call to someone who may be asleep. Someone who will need ninety seconds to understand what they are being asked. Who will ask clarifying questions that should not need asking at this moment. Who will be, in the precise clinical sense, cognitively impaired.

Your adversaries are not calling anyone for permission.

Where decisions should be made, and where they too often end up

The Sullenberger Principle

On 15 January 2009, Captain Chesley Sullenberger lost both engines of US Airways Flight 1549 at 2,818 feet over New York City, when the aircraft struck a flock of Canada geese while climbing out of LaGuardia. He had, roughly, three and a half minutes. No time for a committee. No escalation to a chain of command. What he had was the emergency checklist: a document whose existence reflected thousands of decisions made by engineers, safety officers, and airline operations teams on ordinary afternoons, long before anyone was in trouble.

He worked through it. Then he deviated from it.

That last part is the part that matters. Sullenberger assessed, midway through the engine-restart sequence, that completing the checklist would consume altitude he could not afford to lose. He substituted his own judgement for the written procedure. He ditched in the Hudson. Everyone survived. That deviation was not a failure of the checklist. It was the checklist working exactly as designed — as a foundation, not a cage.

Gary Klein, the cognitive psychologist who spent decades studying expert decision-making in operating theatres, on fire grounds, and in military operations, described this pattern in his recognition-primed decision model. Experts under pressure do not generate multiple options and select the best one. They recognise a situation as belonging to a type — one they have encountered, trained for, and pre-deliberated about — then adapt their pre-existing response to the specific conditions in front of them. The quality of that in-the-moment adaptation depends entirely on the quality of the preparation before the crisis began.

Sullenberger could deviate from the checklist because he had internalised it so completely that he could identify precisely where it was wrong for his specific situation. The checklist was not a constraint. It was what made intelligent deviation possible.

This is the misunderstanding I encounter most often when I argue for pre-determined actions in a SOC. Leaders hear "pre-determined" and imagine rigidity — a linear script applied to a nonlinear incident. The objection is not stupid. No real incident unfolds the way the playbook assumed. A script that cannot accommodate emerging reality creates false confidence and real paralysis.

But pre-determination is not scripting. It is threshold authorisation. The distinction is everything.

Here is how I think about placing any response action into the right tier. Three criteria determine where it belongs. Reversibility: can this action be undone, and how quickly? Isolating a host can be reversed in minutes. Wiping a drive cannot. Confidence threshold: at what detection confidence level does the action fire? Some signals are unambiguous; others require corroboration. Blast radius: how many systems or services does this action affect? An account lock touches one user. A network segment isolation might touch five hundred.

Those three questions are what your senior people should be working through on a Tuesday afternoon — not the playbook syntax, not the vendor documentation, but the principle. Answer them honestly and the tier structure follows. Defer them and you have only relocated the decision to midnight, where it will be answered badly.
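To make the three questions concrete, here is a minimal sketch of how the answers could be turned into a tier. It is an illustration, not a reference implementation: the `ResponseAction` fields, the `Tier` names, and every cut-off value are assumptions of mine, and the real numbers are exactly what the Tuesday afternoon conversation is for.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    AUTOMATIC = "execute when conditions are met"
    APPROVAL = "require human approval"
    NEVER_AUTOMATE = "never automate"


@dataclass(frozen=True)
class ResponseAction:
    name: str
    undo_minutes: Optional[float]  # None means the action cannot be undone
    confidence: float              # detection confidence at which it fires, 0.0 to 1.0
    blast_radius: int              # number of systems or users affected


# Hypothetical cut-offs, chosen only for illustration.
MAX_UNDO_MINUTES = 15.0
MIN_CONFIDENCE = 0.9
MAX_BLAST_RADIUS = 10


def classify(action: ResponseAction) -> Tier:
    """Place an action in a tier using reversibility, confidence and blast radius."""
    if action.undo_minutes is None:
        return Tier.NEVER_AUTOMATE  # irreversible: fails the first test decisively
    if (
        action.undo_minutes <= MAX_UNDO_MINUTES
        and action.confidence >= MIN_CONFIDENCE
        and action.blast_radius <= MAX_BLAST_RADIUS
    ):
        return Tier.AUTOMATIC
    return Tier.APPROVAL  # reversible, but too slow, too uncertain or too wide


isolate_host = ResponseAction("isolate host", undo_minutes=5, confidence=0.95, blast_radius=1)
isolate_segment = ResponseAction("isolate segment", undo_minutes=10, confidence=0.95, blast_radius=500)
wipe_drive = ResponseAction("wipe drive", undo_minutes=None, confidence=0.99, blast_radius=1)

for action in (isolate_host, isolate_segment, wipe_drive):
    print(f"{action.name}: {classify(action).value}")
```

Under these assumed cut-offs the host isolation lands in the automatic tier, the segment isolation needs a human because of its blast radius, and the drive wipe is never automated because it cannot be undone.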

Technologies change. Human nature does not. The temptation to leave those decisions implicit — deferred, subject to judgement in the moment — is as old as every institution that has ever faced an emergency it was not ready for.

The Decision That Needed to Happen at 2pm

So the framework exists: reversibility, confidence, blast radius. What does it look like when an institution actually applies those criteria in advance?

The paramedic does not phone the hospital to ask whether to administer adrenaline during a cardiac arrest. The standing order was written in a room with consultants, pharmacologists, and ethicists who were not standing over a dying person. The paramedic acts. The decision about when to act, under what conditions, with what drug, at what dose, was made at 2pm on a Tuesday. The patient benefits from it at 3am on a Wednesday.

The fire service calls this Pre-Determined Attendance. When a call comes in for a tower block fire, the PDA specifies how many pumps, aerial platforms, and officers are dispatched — before anyone has seen the building. Resource commitment was decided by people who could think clearly, examine data, and accept accountability without a clock running. The Grenfell Tower inquiry found, among many failures, that the PDA for high-rise fires was inadequate. The failure was not in the response. It was in the preparation.

Security operations already have the technical means to implement the equivalent. Modern SOAR platforms allow organisations to categorise response actions by tier. CrowdStrike's operational model distinguishes between actions that execute automatically when conditions are met, actions that require human approval before execution, and actions that must never be automated regardless of circumstances. Green, orange, red. It is a sensible implementation — and it maps directly onto the three criteria above. Green actions are high-confidence, high-reversibility, low blast radius. Red actions fail one or more of those tests decisively. The framework does not originate with any vendor. It is the same logic that aviation, medicine, and the Roman legions independently arrived at.

They all made the decision in advance. Not at 4am.
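The green, orange, red split can be sketched as a lookup that is written in advance and merely consulted during the incident. This is not any vendor's API: the `PLAYBOOK` contents, the action names, and the `dispatch` function are hypothetical, and defaulting unknown actions to orange is my own assumption.

```python
from enum import Enum
from typing import Optional


class Tier(Enum):
    GREEN = "green"    # executes automatically when conditions are met
    ORANGE = "orange"  # requires human approval before execution
    RED = "red"        # must never be automated


# Hypothetical playbook, agreed in a meeting room rather than during an incident.
PLAYBOOK = {
    "isolate_host": Tier.GREEN,
    "lock_account": Tier.GREEN,
    "isolate_segment": Tier.ORANGE,
    "wipe_drive": Tier.RED,
}


def dispatch(action: str, approver: Optional[str] = None) -> str:
    """Return what the platform should do with a requested action."""
    # Assumption: anything nobody classified in advance still needs a human.
    tier = PLAYBOOK.get(action, Tier.ORANGE)
    if tier is Tier.GREEN:
        return "execute"
    if tier is Tier.ORANGE:
        return "execute" if approver else "await approval"
    return "manual only"  # red: an approval does not unlock automation


print(dispatch("isolate_host"))
print(dispatch("isolate_segment"))
print(dispatch("isolate_segment", approver="duty manager"))
print(dispatch("wipe_drive", approver="duty manager"))
```

The point of the design is that the analyst on nights never has to ask whether they are allowed to isolate a host. That question was answered when the table was written.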

This is not a theoretical argument. The cost of failing to decide is measured, repeatedly, in the same currency.

The IBM Cost of a Data Breach report is consistent year after year: organisations with mature automation and pre-determined playbooks detect breaches faster, contain them sooner, and spend substantially less recovering. The gap between high-automation and low-automation organisations regularly runs to millions of pounds. The decision to pre-determine your containment actions is not only a security decision. It is a financial one. It is a governance one.

And if you are operating under NIS2's Article 20 — which imposes personal liability on senior management for security decisions — it is worth noting that pre-authorising a playbook is itself an accountable senior management act. It moves accountability upstream to where it belongs, rather than letting it fall onto a tired analyst at midnight who was never given the authority to act anyway.

I can hear the CrowdStrike objection forming. In July 2024, a faulty content update bricked 8.5 million machines globally, cascading through hospitals, airlines, and financial institutions. If automation caused that, surely we should require human approval for every action? This inverts the lesson. CrowdStrike's failure was not the concept of automated action. It was the absence of constraints — no blast-radius ceiling on what the automation could touch, no confidence threshold that would have slowed the rollout. That is a classification failure, not a pre-authorisation failure. "Automated deployment without constraints" is not the same thing as "pre-authorised containment within classified thresholds." One is recklessness. The other is what I am arguing for.
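What "within classified thresholds" might mean at run time can be sketched as a guard that every automated action has to pass. The function name and both limits are hypothetical, and this says nothing about how any real product is built. It only shows a confidence threshold and a blast-radius ceiling doing their job.

```python
from typing import Sequence, Tuple

# Hypothetical limits, agreed in advance rather than during the incident.
MIN_CONFIDENCE = 0.9
MAX_TARGETS = 25


def within_thresholds(confidence: float, targets: Sequence[str]) -> Tuple[bool, str]:
    """Check one automated action against its pre-agreed constraints."""
    if confidence < MIN_CONFIDENCE:
        return False, f"confidence {confidence:.2f} is below {MIN_CONFIDENCE:.2f}"
    if len(targets) > MAX_TARGETS:
        return False, f"{len(targets)} targets exceed the ceiling of {MAX_TARGETS}"
    return True, "within pre-authorised thresholds"


print(within_thresholds(0.97, ["host-17"]))
print(within_thresholds(0.97, [f"host-{n}" for n in range(400)]))
print(within_thresholds(0.60, ["host-17"]))
```

An action that fails the guard is not dropped. It falls back to human approval, which is where anything outside the pre-agreed envelope belongs.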

The playbook written at 2pm is the one that works at 4am

Failing to Plan

The 1983 Beirut barracks bombing killed 241 American service members in an attack that lasted under three seconds. I want to be honest about what this shows and what it does not. The Marines were operating under pre-determined rules: peacetime rules of engagement that required weapons to be kept unloaded. Pre-determination was present. The problem was that those rules were wrong for the environment. The scenario had been misclassified. A deteriorating security situation had not triggered a reclassification of what rules applied. The lesson is not "pre-determination is dangerous." The lesson is harder. Pre-determine carefully. Review regularly. Build in override mechanisms. A pre-determined rule that is never reviewed is just a different kind of failure to plan.
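"Review regularly" and "build in override mechanisms" can both be written into the pre-authorisation itself. A minimal sketch, assuming a hypothetical `PreAuthorisation` record and an arbitrary 90-day review interval: the authority lapses when the review is overdue, and a human can always take the decision back.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class PreAuthorisation:
    action: str
    approved_by: str
    last_reviewed: date
    review_interval: timedelta = timedelta(days=90)  # hypothetical interval

    def is_current(self, today: date) -> bool:
        """A rule stays valid only while its last review is recent enough."""
        return today - self.last_reviewed <= self.review_interval


def effective_mode(auth: PreAuthorisation, today: date, override: bool = False) -> str:
    if override:
        return "human decision"  # someone on shift has pulled the decision back
    if auth.is_current(today):
        return "pre-authorised"
    return "needs review"        # stale rule: do not act on it automatically


auth = PreAuthorisation("isolate_host", "head of security operations", date(2026, 1, 10))
print(effective_mode(auth, date(2026, 2, 1)))
print(effective_mode(auth, date(2026, 6, 1)))
print(effective_mode(auth, date(2026, 2, 1), override=True))
```

A rule that expires forces the reclassification conversation to happen on a schedule, rather than after the environment has already changed underneath it.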

Technologies change. Human nature does not.

The cognitive limitations that made Petrov's midnight decision so terrifying in 1983 are the same limitations affecting your night shift analyst tonight. The tendency of leaders to retain approval authority as a comfort blanket, to avoid making the hard pre-determination because doing so means accepting accountability in advance, is not new. And I want to be honest about why: the pre-determination conversation itself is politically difficult. It requires the people in the room to agree, in writing, on which actions can be taken without their involvement. That is an act of trust in their teams and a surrender of after-the-fact control. Many leaders would rather preserve the option to decide in the moment, even knowing they will decide badly, than sign a document that removes them from the chain. That reluctance is as old as the Roman Senate debating whether to grant a legate imperium on the frontier.

The research on avoidant authority patterns is depressingly consistent: leaders uncertain of their own accountability push decisions downward during calm periods ("use your judgement") and pull them upward during crises, creating exactly the 4am escalation bottleneck that guarantees slow response and maximum adversary advantage. The analyst who was told to use their judgement discovers, at the moment they most need to act, that their judgement is not trusted after all.

I am not arguing for full automation. I am arguing for Sullenberger's principle: the checklist exists before the emergency. Make the foundational decisions when you can. Pre-determine the thresholds. Pre-authorise the actions within them. Leave humans in the loop for judgements that genuinely require human judgement — and remove them from the approvals that do not. Intelligent deviation is possible only because the foundation was built in daylight.

The analyst who wakes you at 4am has not failed you. You have failed them. You gave them no pre-determined authority to act. Your adversary counted on exactly that.

The next incident is already in your future. Somewhere in your network, a decision is waiting to be made at the worst possible moment by the least prepared person available — just as it waited for Petrov in the dark at Serpukhov-15. The only difference is that you still have time to make it at 2pm.

Make the decision now.

(Views in this article are my own.)

🦩