
#1 2026-05-02 22:42:02

trilobite
Member
From: Saskatchewan, Canada
Registered: 2017-06-27
Posts: 168

AI makes errors at least 30% of the time in my experience

I'm a doc / psych, with extensive research / data / statistical experience. I've asked ChatGPT and Duck.ai many questions about medical and mental health topics I know very well, after 45 years of practice. A rough guesstimate is that about 30%, close to ⅓, of the responses to moderately complex medical questions are wrong. By moderate, I mean things like medication interactions, indications for prescriptions, off-label prescribing, and methods of non-medication therapy.

"Do not trust" is all I can advise.


{Linux-using people I haven't met are friends yet to be made.}

Offline

#2 2026-05-03 00:19:00

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 13,238
Website

Re: AI makes errors at least 30% of the time in my experience

I think AI has its uses, but I totally agree about "do not trust". Often it's 180° out. Point it out and you get "oh yes, I was wrong about that...".


...elevator in the Brain Hotel, broken down but just as well...
( a boring Japan blog (currently paused), now on Bluesky, there's also some GitStuff )

Introduction to the Bunsenlabs Boron Desktop

Offline

#3 2026-05-03 07:31:13

Sun For Miles
Member
Registered: 2017-04-12
Posts: 276

Re: AI makes errors at least 30% of the time in my experience

LLM-based "AI" should not be used at face value in any sort of decision-making process, or, as you found out, as a fact-checking machine. If used in the medical field, take its output as a suggestion from a new joiner in a specialization programme (at best). It could give you some good ideas in the early stages of research, maybe.


Señor Chang, why do you teach Spanish?

Offline

#4 2026-05-03 08:24:28

glittersloth
buena piñata
Registered: 2015-09-30
Posts: 1,542

Re: AI makes errors at least 30% of the time in my experience

The main problem with LLMs is the confidence with which they project their bullshit onto less informed users; https://theconversation.com/half-of-ai- … udy-280512

But they do seem to be improving. 18 months ago ChatGPT couldn’t even give me the MTU size for a simple WireGuard tunnel deployment (it failed to account for PPPoE overhead, which I specified in the query), but now it seems to be better. Still doesn’t know how many r’s are in “strawberry” though.

Look on the bright side. At least it’s not recommending some sort of ayurvedic placenta beverage… I hope not.
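For what it's worth, the MTU arithmetic involved is simple enough to check by hand. A minimal sketch, assuming the usual textbook overheads (8 bytes for the PPPoE header, and 20 + 8 + 32 = 60 bytes for the outer IPv4 + UDP + WireGuard framing; IPv6 endpoints would cost 80 — verify against your own link):

```python
# Back-of-envelope MTU check for WireGuard over a PPPoE link.
# Assumed overheads (typical values; confirm for your own setup):
#   PPPoE header: 8 bytes, so the link MTU drops from 1500 to 1492
#   WireGuard outer packet over IPv4: 20 (IP) + 8 (UDP) + 32 (WG framing) = 60 bytes

ETHERNET_MTU = 1500
PPPOE_OVERHEAD = 8
WG_OVERHEAD_IPV4 = 20 + 8 + 32   # outer IPv4 + UDP + WireGuard data header

link_mtu = ETHERNET_MTU - PPPOE_OVERHEAD   # 1492
wg_mtu = link_mtu - WG_OVERHEAD_IPV4       # 1432

print(f"Set the WireGuard interface MTU to {wg_mtu}")
```

Forgetting the PPPoE line is exactly the mistake described above: you'd end up at 1440 and fragment every full-size packet.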

Offline

#5 Yesterday 04:15:08

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 13,238
Website

Re: AI makes errors at least 30% of the time in my experience

glittersloth wrote:

The main problem with LLMs is the confidence with which they project their bullshit onto less informed users; https://theconversation.com/half-of-ai- … udy-280512

This, very much so. Thanks for that excellent link.

LLMs are great at conversing - that's what they were made for. I think the process of gathering patterns from huge bodies of language and attaching tokens should mean they're pretty good at translation too. Isn't that something like how humans learn languages?

glittersloth wrote:

But they do seem to be improving. 18 months ago ChatGPT couldn’t even give me the MTU size for a simple WireGuard tunnel deployment (it failed to account for PPPoE overhead, which I specified in the query), but now it seems to be better.

My impression too, though your example is outside my comfort zone (found this). My scripting queries seem to get more useful answers these days, and sometimes turn up things that would have taken much reading of man pages and searching of Stack Overflow. When you do get an answer, it's very easy to go back to the man page and find that inconspicuously hidden line you missed before.

article wrote:

Language models do not know things. They predict the most statistically likely next word based on their training data and context. They do not weigh evidence or make value judgments.

Weighing evidence is still something that humans are good at.

But computers are good at following complicated trains of logic, and AI isn't all about LLMs. I'm presuming that the chatbots are able to do a certain amount of thinking themselves?
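The article's "predict the most statistically likely next word" can be sketched with a toy bigram model (a drastic oversimplification of a real LLM, made up here purely for illustration):

```python
from collections import Counter, defaultdict

# Toy bigram "language model": count which word follows which in a tiny corpus,
# then always emit the most frequent follower. Real LLMs use neural networks
# over subword tokens, but the core idea is the same: predict the likeliest
# continuation, with no weighing of evidence involved.
corpus = "the cat sat on the mat the cat ate the fish".split()

followers = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    followers[word][nxt] += 1

def predict(word):
    """Return the statistically likeliest next word seen in training."""
    return followers[word].most_common(1)[0][0]

print(predict("the"))   # "cat" (seen twice after "the", vs once for "mat"/"fish")
```

A real model also samples from a probability distribution rather than always taking the top choice, which is part of why the same question can get different answers on different days.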



Offline

#6 Yesterday 07:43:19

hhh
Gaucho
From: High in the Custerdome
Registered: 2015-09-17
Posts: 16,955
Website

Re: AI makes errors at least 30% of the time in my experience

Isn't current AI (and what the links are saying), in most scenarios, still "algorithm makes the most expected response based on available data"? Basically lying to you. Of course it's confident; it's a robot.

Chatbots are a great example as an end user: obvious grammar mistakes, for one thing (trying to translate into every other language and back again? Just learning from the now-many gibberish/balderdash English and every-other-language speakers garbled with text-speak and slang? Whatever; terrible phrasing, grammar and syntax for basic sentences). No, they will not think, not until Skynet becomes self-aware at 2:14 a.m. Eastern Time, August 29th, 2019. :P (Movie flaw by James Cameron: a Terminator would say "zero two one four hours Eastern Daylight Time". That's what Google's AI says it would say, so... fact.)

Watson and AlphaGo in the early 2010s were good, but they had specific tasks. Something got skipped in turning those into on-demand answers from a cloud, delivered via cellphone.


I don't care what you do at home. Would you care to explain?

Offline

#7 Yesterday 08:03:14

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 13,238
Website

Re: AI makes errors at least 30% of the time in my experience

hhh wrote:

Isn't current AI (and what the links are saying), in most scenarios, still "algorithm makes the most expected response based on available data"? Basically lying to you. Of course it's confident; it's a robot.

That's LLMs, and totally correct AFAIK. And what everyone suddenly started talking about when ChatGPT burst onto the scene is LLMs, which is what people generally mean by "AI". But there's more to it than that.

A lot of programmers are writing a lot of code with the help of AI, and that's not about rehashing stackoverflow posts.



Offline

#8 Yesterday 08:11:31

hhh
Gaucho
From: High in the Custerdome
Registered: 2015-09-17
Posts: 16,955
Website

Re: AI makes errors at least 30% of the time in my experience

@johnraff, I've not read into the details of AI coding's ramifications and abuses, but I know the articles are out there...

https://futurism.com/artificial-intelli … churn-code
https://www.anthropic.com/news/finance-agents
https://garymarcus.substack.com/p/dario … ty-and-the

"Tried to refactor it myself last week, I gave up after two hours. The thing is so tangled that touching one part breaks something completely unrelated.

The generation was fast. The cleanup is a nightmare."



Offline

#9 Yesterday 08:15:34

hhh
Gaucho
From: High in the Custerdome
Registered: 2015-09-17
Posts: 16,955
Website

Re: AI makes errors at least 30% of the time in my experience

Anecdotal, but the guy is supposed to be an AI expert.



Offline

#10 Yesterday 08:24:26

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 13,238
Website

Re: AI makes errors at least 30% of the time in my experience

@hhh thanks, I'm out of time today but I'll read those links tomorrow.

This is all so new, and changing every day. Opinions are really divided, but I personally don't doubt that AI is here to stay, and about to make huge changes in our lives, for good and/or ill.



Offline

#11 Yesterday 08:31:09

hhh
Gaucho
From: High in the Custerdome
Registered: 2015-09-17
Posts: 16,955
Website

Re: AI makes errors at least 30% of the time in my experience

*Morgan Freeman*   In fact, it was not for good after all.



Offline

#12 Yesterday 12:39:23

manyroads
Member
From: around here, somewhere
Registered: 2019-04-16
Posts: 161
Website

Re: AI makes errors at least 30% of the time in my experience

As a guy who has worked with AI tech since the mid-1980s, I can say AIs are tools. And remember the old adage: a fool with a tool is STILL a fool.

As tools, AIs can be extremely useful. They need to be closely monitored and guided (managed). Remember: you are the ethics, you are the expert. Think of an AI as a hammer or saw and yourself as the carpenter. Clearly they are more complex and powerful than a hammer or saw, but the logic holds. They can research with you, but you need to tell them what you seek. They can be fast, but seeking the wrong material rapidly is rarely better than seeking the wrong material slowly. When they find material, you need to evaluate its goodness. They can correlate across domains, but again you need to apply the understanding, ethics, context, etc.

AIs are not magic ("poof, a miracle occurs"). The billionaires market them as if they are, but the oligarchs simply want your engagement, money, and obedience. Sad to say, the Tech Oligarchs are simply using AIs to print money, not to save the planet or humanity. They are totally mercenary.

There are many safe ways to use these tools; the providers generally don't want you to use them in those ways, because THEY won't make any money.

Off my soapbox.  Caveat emptor.

FWIW: I have published about 400 pages of research for anyone who might like to wade through it; just ask. When I put my links on forums I get thrown off for trying to self-promote something I don't even sell. Anyway, remember: AIs are tools.


Pax vobiscum,
Mark Rabideau - https://treemagic.org  https://many-roads.com  https://eirenicon.org 
i3wm, dwm, openbox on debian test ~ Reg. Linux User #449130
"For every complex problem there is an answer that is clear, simple, and wrong." H. L. Mencken

Offline

#13 Yesterday 15:32:07

hhh
Gaucho
From: High in the Custerdome
Registered: 2015-09-17
Posts: 16,955
Website

Re: AI makes errors at least 30% of the time in my experience

I will say, as much as I hate Google for what they've become, I do like that the Android algorithms warn me of scam phone calls, and are right 99 percent of the time, probably closer to 99.8.



Offline

#14 Today 06:00:31

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 13,238
Website

Re: AI makes errors at least 30% of the time in my experience

@hhh thanks for the links.
Futurism seems to have a large body of AI-sceptic material. OK.
Anthropic's finance agents: a bit complicated, but they sound confident...
Gary Marcus' article seemed pretty much on target.

In the hands of very skilled practitioners who pay a lot of attention, and treat the outputs with considerable scrutiny, coding agents can be astonishing.

This Wikipedia page on "generative AI" looks like a good summary: https://en.wikipedia.org/wiki/Generative_AI



Offline

#15 Today 06:08:19

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 13,238
Website

Re: AI makes errors at least 30% of the time in my experience

@manyroads I have only dabbled with AI for a couple of years, doing simple tasks, simply because (sometimes) it saves time, but I find myself totally agreeing with what you wrote. Maybe an extension of your metaphor would be to regard AI tools as bandsaws and power nailers: they get the work done really fast, but mind you don't lose a finger!

And those Oligarchs - like, a shop round the corner has started giving away chainsaws...



Offline

#16 Today 06:40:09

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 13,238
Website

Re: AI makes errors at least 30% of the time in my experience

@trilobite thank you for starting this interesting topic!



Offline

#17 Today 06:55:20

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 13,238
Website

Re: AI makes errors at least 30% of the time in my experience

glittersloth wrote:

ChatGPT...Still doesn’t know how many r in “strawberry” though.

Did a quick check, via duck.ai:
How many letter 'r' s are there in the word 'strawberry'?

GPT5-mini:    2
GPT-4o mini:    3
gpt-oss 120B:    3
Llama-4 Scout:    3
Claude Haiku 4.5 :    3
Mistral 4 small:    3
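(For contrast, the counting task that still trips some of these models up is one line of ordinary code. The usual explanation for the failure is that LLMs see subword tokens rather than individual letters.)

```python
# Deterministic letter count, for comparison with the chatbot answers above.
word = "strawberry"
print(word.count("r"))   # 3
```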



Offline
