Insurance Technology Diary
Episode 32: Testing times
Guillaume Bonnissent’s Insurance Technology Diary

Imagine arriving in 2025 from 1825, and encountering a golf cart. You would no doubt consider this four-wheeled marvel the most incredible piece of technology humankind had ever crafted.
If you didn’t look further, perhaps you’d immediately swap your time machine for the golf cart, then get on with exploring the wonders of the 21st century. You probably wouldn’t realise until far too late that you could’ve swapped your ride home for a Tesla, or a Lamborghini, or even a double-decker bus.
When it comes to AI, people choose the golf cart every day. Do you pile into Anthropic’s Claude 3.7 Sonnet, which claims to be the first with “hybrid reasoning”? Or do you go with cheap-to-run DeepSeek, which (along with everything else) triggered the biggest single-day loss of market value by any one company in US stock-market history? Maybe you still use ChatGPT.
No two AI engines behave the same way, let alone deliver the same results. Some excel at certain functions but are appalling at others.
My old friend James Grant, co-founder of the wonderfully named consultancy iwantmore.ai, suggests a simple but surprisingly uncommon way to ensure you are using the most suitable AI tool: before you settle, try getting a bunch of different ones to do what you actually do.
“If you’re a lawyer,” he says, “get it to summarise a contract. If you’re in training, ask it to create a lesson plan. If you’re in marketing, see if it can write copy that sounds even remotely human.” I add: if you’re an underwriter, get it to assess a risk.
Most of the plethora of new AI models do fairly well when asked, for example, to write a LinkedIn post about AI (note: AI does NOT write Insurance Technology Diary). But they may not handle more specific tasks equally well. Many will produce laughable output when asked to do what you do daily, as second nature.
Try this test, Guru James suggests. Ask your AI to draft an email you really need. If reading and editing it takes longer than it would to write it in the first place, the AI made your life more difficult. But if it works, don’t stop. Demand ever-harder tasks. Find out what makes it fall over. Ask it to do the impossible. See how it responds.
Request nuanced answers, like ‘explain the benefits and drawbacks of parametric insurance compared to conventional cover, from both buyers’ and sellers’ perspectives’. Ask it something really current: ‘What damage was caused in last week’s cyclone over La Réunion?’ Add something really niche that you understand really well: ‘How do I achieve the best balance between coverage and exposure when structuring a D&O policy?’
If the AI can’t cope, test another. If it’s wrong but confident, definitely test another. Give each exactly the same prompts to make the test fair. Even if the output’s good, test another. Then another, and so forth until it does exactly what you want, and makes your job easier.
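For readers who like to automate, the fair-test idea above — identical prompts to every model, results side by side — can be sketched in a few lines. Everything here is hypothetical: `ask` is a stand-in for whatever API each model actually exposes, and the model names are placeholders, not real products.

```python
# A minimal, hypothetical harness for the "same prompts, fair test" idea:
# run every model on IDENTICAL prompts so the outputs are comparable.

PROMPTS = [
    "Draft a short email politely declining a meeting.",
    "Explain parametric vs conventional insurance, for buyers and sellers.",
    "How do I balance coverage and exposure when structuring a D&O policy?",
]

def ask(model: str, prompt: str) -> str:
    # Placeholder: in real use, call the model's actual API here.
    return f"[{model}] response to: {prompt}"

def compare(models: list[str], prompts: list[str]) -> dict[str, list[str]]:
    """Send exactly the same prompts to each model and collect the replies."""
    return {m: [ask(m, p) for p in prompts] for m in models}

results = compare(["model-a", "model-b"], PROMPTS)
```

The point of the structure is simply that every model sees the same list, in the same order, so any difference in the answers is down to the model, not the question.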
Then choose. And beware: you may need different models for different tasks.
Never ride a golf cart when you could have a vintage Harley!