Episode 61: Millennium perdition

Insurance Technology Diary

Guillaume Bonnissent’s Insurance Technology Diary

People of my generation will remember Windows ME, the ‘Millennium Edition’ of the ubiquitous operating system, without fondness. It was an absolute disaster.

Windows ME familiarised everyone with the ‘Blue Screen Of Death’, which happened more often than morning coffee. To make that even worse, ‘System Restore’, the new ME feature which should have been the antidote to BSOD, rarely worked at all.

Microsoft launched into the new, light-speed century with a platform that ran in geological time. Most of my old peripheral drivers were not compatible with ME, but the new ones didn’t work either. The security was almost non-existent, making the new OS a playground for the swelling hacker community. Once the initial joy that accompanies any ‘upgrade’ had waned, Millennium’s predecessor, Windows 98, became a very pleasant memory. Alas, there was no going back.

Many years later, Microsoft sort of admitted that Windows ME was released too soon. Its creators knew it wasn’t quite ready. It was a halfway house between 98 and NT, and had been granted only half the development-cycle time of Windows 98, because the name on its face was an immovable deadline.

It hadn’t been tested properly. Instead, ME’s millions of unsuspecting new users would be unknowingly recruited as de facto beta testers. Microsoft would use these silently consenting guinea-pigs’ feedback to create so-called ‘Service Packs’ that would update the OS to make it work properly (or that, at least, was the idea).

Meanwhile, ME was surrounded by an incredible amount of overboiled hype. It was the operating system that would save us. The TV ad promised “so many possibilities” through an “improved user experience.” It showed digital data swirling around the globe, effortless networking, and (ironically) the great System Restore innovation that would make our lives so much easier (or so the ads seemed to promise).

Some bits worked. They were great. Unfortunately, overall, Windows ME was junkware. It was swiftly superseded by the hugely successful Windows XP, to the relief of everyone with a keyboard.

In stark contrast, no one is about to withdraw generative AI, the technological advancement that is defining the decade. That’s despite its parallels with Windows ME.

As soon as the immature first release of ChatGPT hit our screens, competitors rushed to release their generative AI tools based on Large Language Models. They came thick and fast. In almost no time, GenAI was everywhere (even, infuriatingly, offering to help write my emails).

Almost all these systems (maybe all of them) were released too soon. In effect, if not by design, everyone who uses ChatGPT, Claude, or any of the other ubiquitous GenAI applications that were rushed to market is an unwitting beta tester.

When it comes to consistency, GenAI is as unreliable as Windows Millennium Edition. Sometimes specific tools are incredibly useful, but sometimes not. My ChatGPT desktop app even admits, at the bottom of every screen, that “ChatGPT can make mistakes.” Those flub-ups can land you in court. If you’re already there, GenAI tools have been known, often, to feed hapless users fabricated case law, for which there is no defence.

When I asked, ChatGPT explained this to me. It told me it is “tuned to say what seems true.” I asked for an explanation, and was told:

“• I’m optimized to approximate truthful, helpful answers,
• but I do not have guaranteed correctness,
• and I do sometimes produce false statements that sound right.”

Despite GenAI’s acknowledged, built-in eagerness to please, once the world had seen it in action, people immediately wanted more. To keep up with the competition, unready software was released too soon, and people began using and testing it in their millions. In offices around the world, every second, people are using badly tested GenAI for real work. They’re relying on outputs from systems that declare their fallibility. Yet so popular is the new wizardry that some people have already lost their jobs for not using it.

In a way, the mass beta test approach makes sense. What better way to sort the bugs out? In practice, though, it’s hard to see a benefit for users. No one would buy medical devices or even children’s toys that failed as often as an AI prompt does.

Worse, our testing doesn’t even improve the AI. As my app also declares honestly, “OpenAI doesn’t use Guillaume’s workspace data to train its models.” So even if I try to get the correct answer a few times using different prompts, the only output that will improve is today’s. The system learns nothing from my struggle.

AI was already everywhere before the explosion, but it lurked behind the scenes. It had been around for years, but no one outside programming circles and tech buffs paid it any attention. No pressure was felt to issue half-tested, half-baked tech to anyone but qualified testers.

When we use AI in an insurance technology platform (as we do in all of them, and have done for years), we test the function over and over again. When we ask it to parse 10-Ks or to check routes against sanctions, we try it 10 to the n times before it’s let loose on a client to operate in anger. It all happens in the background, without fanfare.

When our developers use AI to suggest a code amendment, they review it thoroughly before accepting it. After that, it goes through the same stringent testing cycles we insist upon for code written by a human (and the testers love to find holes in AI’s work).

Everyone is beta-testing AI in the foreground now. It is often really useful, so it may seem like a fast, efficient version of a very bright employee in their late 20s. You tell it what to do, and it does it, sometimes very well.

Embrace AI. Challenge it. Love it, even. It’s definitely a critical component of insurance technology’s future. But now, whilst it is still in the testing phase, make sure always to confirm its output. And as with that promising young employee who is tangibly eager to please, you want to make sure it’s tried and tested, and tested again, before any big promotions are offered, or too much is left to ride on its work.
