cns me / blog Cloudy with a Chance of Freefall

The morning Priya was failed by a tripwire she was never told about

AI Generated: Safety today families count on you

Priya processes supplier invoices in accounts payable at a mid-sized facilities company, and on a Tuesday morning, in the third hour of a queue of forty-one, she asks the company chatbot to total a payment schedule, copies the figure into an email, and presses send, which is the moment a product called Plumbline registers a hit.

The figure was wrong. Plumbline put it there.

That is the product I want to pitch to you.

AI Generated: Tired accounts worker at a monitor with a long invoice queue

Plumbline (tagline: "Trust, but verify your people") deliberately seeds misinformation into the output of your AI chatbot, a tool your staff increasingly use the way they once used a calculator. A honeytoken, in security, is a decoy only an attacker would touch, like a fake password left in a file so that you know the moment someone opens it. Plumbline inverts the target: it plants the decoy in the answer your most diligent employee has been told to rely on. A wrong price. A transposed sum. A supplier discount that never existed. Each is tagged, so when Priya passes it on unchecked, the system records that she did, to whom, and when. The deck calls this "a human firewall," sold in "Vigilance" and "Vigilance Plus" tiers with a per-employee dashboard.


You can see why a board might buy it. Inaccuracy is the most commonly cited negative consequence organisations report from generative AI, according to McKinsey's State of AI survey, and a product promising to find the careless humans is, on a slide, an easy yes.

AI Generated: Boardroom projecting vigilance

The wrongness accumulates the longer you look. The chatbot was already wrong, often, before Plumbline arrived. Purpose-built, retrieval-backed legal AI tools, the expensive kind, were found by Stanford researchers to hallucinate, that is to state a confident falsehood, on more than one query in six for Lexis+ AI and around one in three for Westlaw. (Retrieval-backed means it answers from documents it looks up, meant to keep it honest. It does not, entirely.) So Plumbline adds deliberate errors to a substrate already wrong, then bills Priya for the gap.

The substrate also moves underneath her. The same model can quietly change behaviour between releases: one study of ChatGPT found GPT-4's accuracy on a simple maths task fall from 84 per cent in March 2023 to 51 per cent that June. Vendors swap versions without telling anyone, and outputs are non-deterministic by default. It is stable the way a river is stable, never the same water twice. Plumbline's answer is to watch Priya.

AI Generated: Stable the way the water is never the same twice.

I should be fair to the tool. The AI Priya uses often makes her work better: in one pathology study AI assistance raised diagnostic performance while introducing a 7 per cent rate of automation bias, the tendency to defer to the machine over your own judgement. That is why people trust it, and why a planted error slips through: a tool that helped you four hundred times this week has earned the trust that lets the four hundred and first answer pass unchecked, the trust Plumbline depends on to land its hit.

Here the pitch is at its most confident, and its most wrong. Its premise is that Priya can sit at a screen for hours and reliably catch the rare planted error among the real ones. She cannot, and not for want of diligence: it is a limit on what a nervous system can do.

In 1983 Lisanne Bainbridge set this out in "Ironies of Automation". When you automate the totting-up and leave the human to watch for the machine's slips, "it is impossible for even a highly motivated human being to maintain effective visual attention towards a source of information on which very little happens, for more than about half an hour." Her conclusion: "The human monitor has been given an impossible task." A separate review of boredom and vigilance for the Office of Naval Research found the same wall about thirty minutes in. This decay has a name, the vigilance decrement; it is not a character flaw but a property of attention.

Plumbline registered its hit against Priya in the third hour.

The failure it records cannot be trained away. Automation complacency, the tendency to stop checking a machine you rely on, "is found in both naive and expert participants and cannot be overcome with simple practice," according to Parasuraman and Manzey's review. Plumbline proposes to find this universal property in Priya, log it against her name, and sell the log to her employer.

I do not want to sell you Plumbline. I think it is a genuinely bad idea, and I give it away free so nobody builds it.

AI Generated: Employee shaming as a service

We have done this before. Plumbline is a phishing simulation wearing a new coat. An eight-month randomised trial of more than 19,500 employees found this kind of training "unlikely to offer significant practical value"; what mattered was the lure, not the person. The UK's National Cyber Security Centre is blunter: "blaming users for clicking on links doesn't work," and "punishing people for clicking on emails you've sent starts to resemble entrapment." Joel Samuel, writing on defence in depth, the principle of many overlapping layers rather than one, notes that "the vast majority of phishing attacks are successful as a result of the IT and/or cybersecurity teams failing to implement good IT systems." The failure was upstream; the blame was aimed downstream. Safety is a property you design into a system, not a virtue you demand from a tired person inside it, which is why you should never trust a human in production.

A genuine canary token is a tripwire only an attacker would trip; the ethics are entirely in where you place the bait. Plumbline reverses the one thing that made it fair: it plants the bait inside the work, on the desk of the person doing the job she was told to do, with the tool she was told to use, in the third hour when no attention survives. Sadly, that is the part doing the real work.

It appeals because it locates the problem in a person, not a system. Sidney Dekker calls the two views of human error the bad apple and the bad barrel. When a system falls over, the mature response is the blameless postmortem, which asks what allowed the failure rather than who to punish, because AI tends to replace comprehension with the feeling of comprehension, and the lapse is structural, not personal. We extend that courtesy to our outages; Plumbline is built to deny it to Priya.

So what would the honest version measure? It would invert every cruel default Plumbline depends on.

  1. Consent and anonymity, not surveillance. You red-team your organisation, that is probe it for weaknesses on purpose, with staff's knowledge and against an anonymised population. You sample at the level of the organisation, never the task: below a threshold of cases you measure nothing rather than out the individual. You learn that your workflow lets some AI errors through; you never learn it was Priya.
  2. Measure to size the risk, not to score the person. The same data answers two questions, and the ethics turn on which you ask. Used to score Priya it is both futile, because the lapse cannot be trained away, and cruel; used to size the error your process passes downstream it is the endorsed use. Being held accountable in the right way reduces automation bias; aimed at the individual, shame is a boomerang that comes back on the organisation that threw it.
  3. Build the missing layer, which is verification, not vigilance. Bainbridge's point was that watching is the wrong job to give a human. Checking is real, and sometimes only a person can do it; the failure is making the check covert, punitive and unaided, not the idea that work should be checked. The fix is a checking step in the system: a second source, a recomputation, a confirmation the tool itself surfaces, so catching the error does not depend on Priya's attention past the half-hour.
  4. Restrict the planted errors to inaccuracies the business already accepts, so the exercise measures a risk you chose to carry, not one you manufactured.
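
For the engineers in the room, here is a minimal sketch in Python of what points 1 and 3 could look like. It is an illustration under my own assumptions, not a specification: the function names, the threshold of 20 cases and the string-based invoice lines are all hypothetical. The first half is the checking column, a recomputation the system does itself; the second reports an error rate for the whole organisation and reports nothing at all when the sample is small enough to point at one person.

```python
from decimal import Decimal
from typing import Optional

# Hypothetical threshold: below this many sampled cases, report nothing.
MIN_CASES = 20


def recomputed_total(line_items: list[str]) -> Decimal:
    """Re-add the schedule from its source lines, using exact decimal arithmetic."""
    return sum((Decimal(item) for item in line_items), Decimal("0"))


def needs_confirmation(tool_total: str, line_items: list[str]) -> bool:
    """True when the chatbot's figure disagrees with the recomputation."""
    return Decimal(tool_total) != recomputed_total(line_items)


def org_error_rate(outcomes: list[bool], min_cases: int = MIN_CASES) -> Optional[float]:
    """Share of sampled answers where an error passed downstream, organisation-wide.

    `outcomes` carries no names and no task identifiers, only True (an error
    got through) or False. Below `min_cases` the function returns None, so a
    small sample can never single out an individual.
    """
    if len(outcomes) < min_cases:
        return None
    return sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    schedule = ["1200.00", "349.50", "75.25"]
    print(needs_confirmation("1624.75", schedule))  # figures agree
    print(needs_confirmation("1642.75", schedule))  # transposed digits are flagged
    print(org_error_rate([True] * 3 + [False] * 37))  # enough cases to report
    print(org_error_rate([True, False]))  # too few cases, so nothing is reported
```

The design choice that matters is the `None`: when the sample is too small the honest version declines to answer, and the recomputation runs on every total, not on the ones Priya happens to have attention left for.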

AI Generated: Triple check your results

The history rhymes. The Boeing Model 299 in 1935 was judged "too much airplane for one man to fly," after which the answer was a checklist, not a memo blaming the pilot. Plumbline poisons by design the knowledge base I once called the colleague who never forgets. Strip the deck away and it is the paperclip maximiser as a business model: it manufactures the failure its customers fear most, then sells the cure for the disease it injects.

I would happily build the ethical version, the consented, anonymised, tool-measuring one; that is work I would take. The version that puts Priya's name on a dashboard, I would refuse.

It turns out the whole thing reduces to one image. Plumbline hands Priya a spreadsheet clever enough to write its own formulas, then forbids her to trust any sum it produces until she has re-checked it by hand on a calculator she has no time to reach for. If the sums cannot be trusted, the answer is a better spreadsheet, or a checking column the sheet fills in itself. It was never the abacus, and it was never Priya.

(Views in this article are my own.)

🦩