GPT-5 Kills RAG. As For Healthcare, It's OpenAI's Next Big Gamble

While GPT-5 crushed it in other areas, it's a step back for healthcare. I break down GPT-5's pros and cons, and pit it against other major LLMs in a head-to-head race on challenging NEJM cases.

Aug 12, 2025


Source of the original image: AI CERTS at https://www.aicerts.ai/news/top-gpt-5-features-and-how-theyre-changing-ai-in-2025/

Welcome to AI Health Uncut, a brutally honest newsletter on AI, innovation, and the state of the healthcare market. If you’d like to sign up to receive issues over email, you can do so here.

GPT-5, promised back in 2023, is finally here. Suffice it to say, the reaction across the internet and social media has been underwhelming, to put it mildly.

Source: Dr. Jeffrey Funk at https://www.linkedin.com/posts/dr-jeffrey-funk-a979435_hype-technology-innovation-activity-7359897907354882052-JT7v

Let me calmly dissect what’s good and what’s not, especially through the lens of healthcare.

But first, a quick announcement. On September 11, 2025, I’ll be moderating the panel “GenAI in Healthcare: A Conversation with Foundation Model Builders” at the Prax AI x Healthcare Summit in NYC. I don’t say this often, but the agenda for this conference actually looks really strong, especially if you’re in healthcare AI.

Get $100 off the ticket price when you become a paid subscriber to AI Health Uncut.

OK, now back to GPT-5... 😉

Here’s the kicker: this is actually a two-part article. I’ve teamed up with an up-and-coming (and insanely prolific) Substack star, Maria Sukhareva, whose rapidly growing newsletter AI Realist is quickly becoming essential reading on AI.

Maria works at Siemens and is the ultimate LLM expert. If you need only one source of truth on LLMs, subscribe to AI Realist, sit back, and absorb the knowledge. What I appreciate most is that she gets straight to the point — no hype, no pandering to “AI gurus.” Reminds me of my own writing. 🙂


Ever heard of ‘fairwashing,’ or ‘post-hoc rationalization,’ or ‘toxicity filtering’ in healthcare AI? You should. In the other half of this two-part series, “OpenAI’s Healthcare: Analyzing the Promises of AI Medical Miracles,” Maria dives deep into these, and many other, critical themes.

OK, here is the TL;DR:

1. GPT-5: Key Facts in Quick Bullet Points
2. Let’s Be Honest: GPT-5 Is the Best RAG Around
3. The Moment Dr. Derya Unutmaz Started Hyping GPT-5, My BS Alarm Went Off
4. OpenAI Wants Your Biopsy But Can’t Keep Your Medical Secrets
5. OpenAI’s HealthBench: The Moment I Realized AI in Medicine Is Just Prompt Engineering
6. Beating Benchmarks Ain’t Hard When You Train on Them
7. The Real Reason OpenAI Doesn’t Want You Fine-Tuning
8. NEJM Hard Cases Show GPT-5 Ain’t That Special 😢
9. Conclusion: GPT-5 Is Not AGI — It’s a Step Toward AI Commoditization and Cost Reduction, Not Toward Healthcare

1. GPT-5: Key Facts in Quick Bullet Points

For those who actually have a life and aren’t obsessively tracking every single AI model release, here’s a quick rundown of the general positives and negatives of GPT-5:

GPT-5 Positives:

Tops community and composite leaderboards like LMArena and Artificial Analysis.
New “unified system” that routes between a fast model and a deeper reasoning model for better cost and quality.
Larger 400K-token context window in the API.
GPT-5 is probably the best retrieval-augmented generation (RAG) system we’ve got. (I break it down in section 2 below.)

Fewer hallucinations, according to OpenAI. (Though Maria’s work, and that of others, casts serious doubt on that claim.)
Strong coding gains at lower price points. Main GPT-5 is cheaper than Claude Sonnet and competitive with Gemini Pro. Multiple mini and nano tiers.
Clear framing that “abilities will develop more slowly than products,” extending the product overhang as real-world UX improves.
Cheaper models could boost adoption via the Jevons paradox — the idea that when a resource or technology becomes cheaper or more efficient, total consumption often increases rather than decreases because lower cost spurs greater use. I put this under positive, but it could just as easily be a negative.
Performance trend remains positive and tracks the timelines from METR’s “Measuring AI Ability to Complete Long Tasks” study rather than showing a sudden step change.
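To make the Jevons paradox point above concrete, here is a minimal back-of-the-envelope sketch. It assumes a simple constant-elasticity demand curve, and every number in it (prices, token volumes, elasticities) is hypothetical and chosen only for illustration, not taken from OpenAI pricing or usage data. The function name `usage_and_spend` is my own.

```python
def usage_and_spend(price, base_price, base_usage, elasticity):
    """Constant-elasticity demand (an assumption): usage scales as
    (price / base_price) ** -elasticity. Returns (usage, total spend)."""
    usage = base_usage * (price / base_price) ** (-elasticity)
    return usage, usage * price


# Hypothetical numbers, for illustration only.
BASE_PRICE = 10.0    # dollars per million tokens before the price cut
BASE_USAGE = 100.0   # million tokens consumed per month before the cut
NEW_PRICE = 5.0      # price after a 50% cut

baseline_spend = BASE_PRICE * BASE_USAGE

for elasticity in (0.5, 1.0, 1.5):
    usage, spend = usage_and_spend(NEW_PRICE, BASE_PRICE, BASE_USAGE, elasticity)
    print(
        f"elasticity={elasticity}: usage={usage:.1f}M tokens, "
        f"spend=${spend:.0f} (baseline ${baseline_spend:.0f})"
    )
```

Under these assumptions, token usage rises after any price cut, but total spend (and, roughly, total compute consumed) only grows when the elasticity is above 1. That is why the same bullet could land in either column: whether cheaper models mean more or less total consumption depends on how strongly demand responds to price.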

GPT-5 Negatives:

Launch was messy. Mislabeled plots. Buggy live demos. Odd rollout. You don’t need to be Gary Marcus to see that. 😉
Not a dramatic capability leap. Incremental gains rather than a step change.
Router adds uncertainty for power users about when the system “thinks” versus when it answers fast.
Hype versus AGI expectations created a perception gap. Fundraising tied to AGI narratives could get harder.
After two years of promises that GPT-5 might be that distant star called AGI (Artificial General Intelligence), or even ASI (Artificial Superintelligence), the complete collapse of those promises feels deeply disappointing. Sam Altman’s childish pre-launch hype didn’t help and came off as mostly about his ego.
GPT-5 turned out to be more about lowering costs for OpenAI than about pushing the boundaries of the frontier.
Diminishing returns from brute-force scaling. Cheaper is beating bigger.
Slower visible benchmark progress invites critics to claim “AI is stalling.”
Possible slowing of the “money train” as infrastructure spend cools. More pressure to ship real products.
GPT-5 isn’t ready for healthcare. And that’s exactly the point of this article.

Let’s dive into the positives first.

2. Let’s Be Honest: GPT-5 Is the Best RAG Around
