
#1 2026-05-02 22:42:02

trilobite
Member
From: Saskatchewan, Canada
Registered: 2017-06-27
Posts: 168

AI makes errors at least 30% of the time in my experience

I'm a doc / psych, with extensive research / data / statistical experience. I've asked ChatGPT and Duck.ai many questions about medical and mental health topics I know very well, after 45 years of practice. A rough guesstimate is that about 30%, close to ⅓, of the responses to moderately complex medical questions are wrong. By moderate, I mean things like medication interactions, indications for prescriptions, off-label prescribing, and methods of non-medication therapy.

"Do not trust" is all I can advise.


{Linux-using people I haven't met are friends yet to be made.}

Offline

#2 2026-05-03 00:19:00

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 13,238
Website

Re: AI makes errors at least 30% of the time in my experience

I think AI has its uses, but I totally agree about "do not trust". Often it's 180° out. Point it out and you get "oh yes, I was wrong about that...".


...elevator in the Brain Hotel, broken down but just as well...
( a boring Japan blog (currently paused), now on Bluesky, there's also some GitStuff )

Introduction to the Bunsenlabs Boron Desktop

Offline

#3 2026-05-03 07:31:13

Sun For Miles
Member
Registered: 2017-04-12
Posts: 276

Re: AI makes errors at least 30% of the time in my experience

LLM-based "AI" should not be used at face value in any sort of decision-making process, or, as you found out, as a fact-checking machine. If used in the medical field, take its output as a suggestion from a new joiner in a specialization programme (at best). It could give you some good ideas in the early stages of research, maybe.


Señor Chang, why do you teach Spanish?

Offline

#4 2026-05-03 08:24:28

glittersloth
buena piñata
Registered: 2015-09-30
Posts: 1,542

Re: AI makes errors at least 30% of the time in my experience

The main problem with LLMs is the confidence with which they project their bullshit onto less informed users; https://theconversation.com/half-of-ai- … udy-280512

But they do seem to be improving. 18 months ago ChatGPT couldn’t even give me the MTU size for a simple WireGuard tunnel deployment (it failed to account for PPPoE overhead, which I specified in the query), but now it seems to be better. Still doesn’t know how many r’s are in “strawberry” though.

Look on the bright side. At least it’s not recommending some sort of ayurvedic placenta beverage… I hope not.
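For what it's worth, the MTU arithmetic involved is simple enough to check by hand. A minimal sketch, assuming the usual textbook overheads (8 bytes for the PPPoE header, and 20 + 8 + 32 = 60 bytes for the outer IPv4 + UDP + WireGuard framing; IPv6 endpoints would cost 80 — verify against your own link):

```python
# Back-of-envelope MTU check for WireGuard over a PPPoE link.
# Assumed overheads (typical values; confirm for your own setup):
#   PPPoE header: 8 bytes, so the link MTU drops from 1500 to 1492
#   WireGuard outer packet over IPv4: 20 (IP) + 8 (UDP) + 32 (WG framing) = 60 bytes

ETHERNET_MTU = 1500
PPPOE_OVERHEAD = 8
WG_OVERHEAD_IPV4 = 20 + 8 + 32   # outer IPv4 + UDP + WireGuard data header

link_mtu = ETHERNET_MTU - PPPOE_OVERHEAD   # 1492
wg_mtu = link_mtu - WG_OVERHEAD_IPV4       # 1432

print(f"Set the WireGuard interface MTU to {wg_mtu}")
```

Forgetting the PPPoE line is exactly the mistake described above: you'd end up at 1440 and fragment every full-size packet.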

Offline

#5 Yesterday 04:15:08

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 13,238
Website

Re: AI makes errors at least 30% of the time in my experience

glittersloth wrote:

The main problem with LLMs is the confidence with which they project their bullshit onto less informed users; https://theconversation.com/half-of-ai- … udy-280512

This, very much so. Thanks for that excellent link.

LLMs are great at conversing - that's what they were made for. I think the process of gathering patterns from huge bodies of language and attaching tokens should mean they're pretty good at translation too. Isn't that something like how humans learn languages?

glittersloth wrote:

But they do seem to be improving. 18 months ago ChatGPT couldn’t even give me the MTU size for a simple WireGuard tunnel deployment (it failed to account for PPPoE overhead, which I specified in the query), but now it seems to be better.

My impression too, though your example is outside my comfort zone (found this). My scripting queries seem to get more useful answers these days, and sometimes turn up things that would have taken much reading of man pages and searching of Stack Overflow. When you do get an answer, it's very easy to go back to the man page and find that inconspicuously hidden line you missed before.

article wrote:

Language models do not know things. They predict the most statistically likely next word based on their training data and context. They do not weigh evidence or make value judgments.

Weighing evidence is still something that humans are good at.

But computers are good at following complicated trains of logic, and AI isn't all about LLMs. I'm presuming that the chatbots are able to do a certain amount of thinking themselves?
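The article's "predict the most statistically likely next word" can be sketched with a toy bigram model (a drastic oversimplification of a real LLM, made up here purely for illustration):

```python
from collections import Counter, defaultdict

# Toy bigram "language model": count which word follows which in a tiny corpus,
# then always emit the most frequent follower. Real LLMs use neural networks
# over subword tokens, but the core idea is the same: predict the likeliest
# continuation, with no weighing of evidence involved.
corpus = "the cat sat on the mat the cat ate the fish".split()

followers = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    followers[word][nxt] += 1

def predict(word):
    """Return the statistically likeliest next word seen in training."""
    return followers[word].most_common(1)[0][0]

print(predict("the"))   # "cat" (seen twice after "the", vs once for "mat"/"fish")
```

A real model also samples from a probability distribution rather than always taking the top choice, which is part of why the same question can get different answers on different days.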



Offline

#6 Yesterday 07:43:19

hhh
Gaucho
From: High in the Custerdome
Registered: 2015-09-17
Posts: 16,955
Website

Re: AI makes errors at least 30% of the time in my experience

Isn't current AI (and what the links are saying), in most scenarios, still "algorithm makes the most expected response based on available data"? Basically lying to you. Of course it's confident; it's a robot.

Chatbots are a great example as an end user: obvious grammar mistakes, for one thing (trying to translate into every other language and back again? Just learning from the now-many gibberish/balderdash English and every-other-language speakers garbled with text-speak and slang? Whatever; terrible phrasing, grammar and syntax for basic sentences). No, they will not think, not until Skynet becomes self-aware at 2:14 a.m. Eastern Time, August 29th, 2019. :P (Movie flaw by James Cameron: a Terminator would say "zero two one four hours Eastern Daylight Time". That's what Google's AI says it would say, so... fact.)

Watson and AlphaGo in the early 2010s were good, but they had specific tasks. Something got skipped in turning those into on-demand answers from a cloud, delivered via cellphone.


I don't care what you do at home. Would you care to explain?

Offline

#7 Yesterday 08:03:14

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 13,238
Website

Re: AI makes errors at least 30% of the time in my experience

hhh wrote:

Isn't current AI (and what the links are saying), in most scenarios, still "algorithm makes the most expected response based on available data"? Basically lying to you. Of course it's confident; it's a robot.

That's LLMs, and totally correct AFAIK. And what everyone suddenly started talking about when ChatGPT burst onto the scene is LLMs, which is what people generally mean by "AI". But there's more to it than that.

A lot of programmers are writing a lot of code with the help of AI, and that's not about rehashing stackoverflow posts.



Offline

#8 Yesterday 08:11:31

hhh
Gaucho
From: High in the Custerdome
Registered: 2015-09-17
Posts: 16,955
Website

Re: AI makes errors at least 30% of the time in my experience

@johnraff, I've not read into the details of AI coding's ramifications and abuses, but I know the articles are out there...

https://futurism.com/artificial-intelli … churn-code
https://www.anthropic.com/news/finance-agents
https://garymarcus.substack.com/p/dario … ty-and-the

"Tried to refactor it myself last week, I gave up after two hours. The thing is so tangled that touching one part breaks something completely unrelated.

The generation was fast. The cleanup is a nightmare."



Offline

#9 Yesterday 08:15:34

hhh
Gaucho
From: High in the Custerdome
Registered: 2015-09-17
Posts: 16,955
Website

Re: AI makes errors at least 30% of the time in my experience

Anecdotal, but the guy is supposed to be an AI expert.



Offline

#10 Yesterday 08:24:26

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 13,238
Website

Re: AI makes errors at least 30% of the time in my experience

@hhh thanks, I'm out of time today but I'll read those links tomorrow.

This is all so new, and changing every day. Opinions are really divided, but I personally don't doubt that AI is here to stay, and about to make huge changes in our lives, for good and/or ill.



Offline

#11 Yesterday 08:31:09

hhh
Gaucho
From: High in the Custerdome
Registered: 2015-09-17
Posts: 16,955
Website

Re: AI makes errors at least 30% of the time in my experience

*Morgan Freeman*   In fact, it was not for good after all.



Offline

#12 Yesterday 12:39:23

manyroads
Member
From: around here, somewhere
Registered: 2019-04-16
Posts: 161
Website

Re: AI makes errors at least 30% of the time in my experience

As a guy who has worked with AI tech since the mid-1980s, I can say AIs are tools. And remember the old adage: a fool with a tool is STILL a fool.

As tools, AIs can be extremely useful. They need to be closely monitored and guided (managed). Remember: you are the ethics, you are the expert. Think of an AI as a hammer or saw and yourself as the carpenter. Clearly they are more complex and powerful than a hammer or saw, but the logic holds. They can research with you, but you need to tell them what you seek. They can be fast, but seeking the wrong material rapidly is rarely better than seeking the wrong material slowly. When they find material, you need to evaluate its goodness. They can correlate across domains, but again you need to apply the understanding, ethics, context, etc.

AIs are not magic ("poof, a miracle occurs"). The billionaires market them as if they are, but the oligarchs simply want your engagement, money, and obedience. Sad to say, the Tech Oligarchs are simply using AIs to print money, not to save the planet or humanity. They are totally mercenary.

There are many safe ways to use these tools; the providers generally don't want you to use them in those ways, because THEY won't make any money.

Off my soapbox.  Caveat emptor.

FWIW: I have published about 400 pages of research for anyone who might like to wade through it; just ask. When I put my links on forums I get thrown off for trying to self-promote something I don't even sell. Anyway, remember: AIs are tools.


Pax vobiscum,
Mark Rabideau - https://treemagic.org  https://many-roads.com  https://eirenicon.org 
i3wm, dwm, openbox on debian test ~ Reg. Linux User #449130
"For every complex problem there is an answer that is clear, simple, and wrong." H. L. Mencken

Offline

#13 Yesterday 15:32:07

hhh
Gaucho
From: High in the Custerdome
Registered: 2015-09-17
Posts: 16,955
Website

Re: AI makes errors at least 30% of the time in my experience

I will say, as much as I hate Google for what they've become, I do like that the Android algorithms warn me of scam phone calls, and are right 99 percent of the time, probably closer to 99.8.



Offline

#14 Today 06:00:31

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 13,238
Website

Re: AI makes errors at least 30% of the time in my experience

@hhh thanks for the links.
Futurism seems to have a large body of AI-sceptic material. OK.
Anthropic's finance agents: a bit complicated, but they sound confident...
Gary Marcus' article seemed pretty much on target.

In the hands of very skilled practitioners who pay a lot of attention, and treat the outputs with considerable scrutiny, coding agents can be astonishing.

This Wikipedia page on "generative AI" looks like a good summary: https://en.wikipedia.org/wiki/Generative_AI



Offline

#15 Today 06:08:19

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 13,238
Website

Re: AI makes errors at least 30% of the time in my experience

@manyroads I have only dabbled with AI for a couple of years, doing simple tasks, simply because (sometimes) it saves time, but I find myself totally agreeing with what you wrote. Maybe an extension of your metaphor would be to regard AI tools as bandsaws and power nailers: they get the work done really fast, but mind you don't lose a finger!

And those Oligarchs - like, a shop round the corner has started giving away chainsaws...



Offline

#16 Today 06:40:09

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 13,238
Website

Re: AI makes errors at least 30% of the time in my experience

@trilobite thank you for starting this interesting topic!



Offline

#17 Today 06:55:20

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 13,238
Website

Re: AI makes errors at least 30% of the time in my experience

glittersloth wrote:

ChatGPT...Still doesn’t know how many r in “strawberry” though.

Did a quick check, via duck.ai:
How many letter 'r' s are there in the word 'strawberry'?

GPT5-mini:    2
GPT-4o mini:    3
gpt-oss 120B:    3
Llama-4 Scout:    3
Claude Haiku 4.5 :    3
Mistral 4 small:    3
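(For contrast, the counting task that still trips some of these models up is one line of ordinary code. The usual explanation for the failure is that LLMs see subword tokens rather than individual letters.)

```python
# Deterministic letter count, for comparison with the chatbot answers above.
word = "strawberry"
print(word.count("r"))   # 3
```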



Offline
