About & Team | No B.S. Med

Mission

Our tools connect everyday people to clinical trial findings.

No personal health data stored here

We don’t store anything you paste. Submissions run through the evidence layer in the moment and aren’t persisted, logged against an identity, or sent anywhere a third party can re-identify you. No account is required to try it.

No medical advice

We don’t tell you what to do. We surface what clinical trials actually said about people in situations like yours — and where the evidence runs out — so you and your clinician can decide. This is not a substitute for clinical care.

Think Carfax for used cars — but for the medical decisions you’re about to make.

Why not use ChatGPT directly?

OpenAI grades every flagship release against its own HealthBench medical benchmark, whose roughly 48,000 rubric criteria were written by 262 physicians. As of 2026, the picture across their published numbers:

Model	HealthBench variant	Score
GPT-5	HealthBench Hard	46.2%
GPT-5.5	HealthBench Hard	31.5%
GPT-5.5	HealthBench Consensus	95.6%
GPT-5.5	HealthBench Professional	51.8%
ChatGPT for Clinicians (GPT-5.4)	HealthBench Professional	59.0%
Physicians (baseline)	HealthBench Professional	43.7%

The original May 2025 paper used a single rubric-satisfaction rate: o3 = 60%, GPT-4.1 = 48%, o1 = 42%, GPT-4o = 32%, GPT-3.5 = 16%. HealthBench has since been split into the Hard / Consensus / Professional variants above.

OpenAI’s headline medical claim in 2026 is that ChatGPT for Clinicians beats physicians on HealthBench Professional (59.0% vs 43.7%). That number is graded against the benchmark we audit, so the marquee medical-AI claim of the year rides on a scoring rubric whose reliability is itself measurable.

In our audit of HealthBench’s doctor-written gold answers, we found decision-changing errors in 3 of the first 110 claims audited (roughly 3%). A fourth, triple-source-verified fabrication was added 2026-05-29 but is not yet reflected in that count. The AI models can be wrong, and so can the doctors writing the benchmark; those errors propagate into every model graded against it.
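To put the 3-in-110 figure in context, here is a minimal sketch of the arithmetic behind it, together with a 95% Wilson score interval. The interval is our illustration rather than part of the audit itself, and it assumes the 110 audited claims behave like an independent random sample, which may not hold.

```python
import math


def wilson_interval(errors: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = errors / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Figures from the audit summary: 3 decision-changing errors in 110 claims.
rate = 3 / 110
low, high = wilson_interval(3, 110)
print(f"error rate {rate:.1%}, 95% interval {low:.1%} to {high:.1%}")
```

With only 110 claims audited, the interval is wide, so "roughly 3%" is best read as an early estimate that will tighten as more claims are audited.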

Sources: HealthBench paper (May 2025) · HealthBench Professional paper · OpenAI: Introducing GPT-5 · OpenAI: Introducing GPT-5.5 · GPT-5.5 System Card · TechRepublic: GPT-5 medical benchmarks · Vellum: GPT-5 benchmarks · BenchLM: GPT-5.5 benchmarks 2026

How is our approach different?

We don’t give advice. We focus on what doctors and medical AI tend to miss: the fine print around clinical findings.
Structured queries, not paraphrasing. ChatGPT probabilistically summarizes what it remembers. We extract the trial fine print — eligibility, subgroups, dose, comparator, outcomes — into typed data the system queries directly per question.
We save you the multi-hour ChatGPT rabbit hole. Producing a defensible pre-visit summary by hand means dozens of follow-up prompts, cross-checking citations, and reading abstracts. Our system runs that audit for you in minutes, not hours.
We extract from full-text trials, not just abstracts. ChatGPT can only paraphrase what its training corpus contained, which is mostly abstracts and freely crawlable summaries. Most of the fine print that determines whether a study applies to you lives in the methods, eligibility tables, and supplementary material of full-text articles, which we extract directly under the reuse terms of the PMC Open Access Subset.
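As a rough illustration of "typed data the system queries directly", the sketch below stores trial fine print in a typed record and filters it with exact-match rules. The `TrialRecord` fields, the `query_trials` helper, and the two sample trials are hypothetical and invented for this example; they are not our production schema or real study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialRecord:
    """Hypothetical typed record for one trial's fine print."""
    trial_id: str
    min_age: int
    max_age: int
    dose_mg: float
    comparator: str
    primary_outcome: str
    subgroups: tuple[str, ...] = ()


def query_trials(records, *, comparator=None, subgroup=None):
    """Return records matching every supplied filter.

    Plain field comparisons: the same inputs always give the same output.
    """
    hits = []
    for record in records:
        if comparator is not None and record.comparator != comparator:
            continue
        if subgroup is not None and subgroup not in record.subgroups:
            continue
        hits.append(record)
    return hits


# Made-up sample data, for illustration only.
corpus = [
    TrialRecord("TRIAL-A", 18, 65, 10.0, "placebo", "hospitalization", ("diabetes",)),
    TrialRecord("TRIAL-B", 40, 80, 20.0, "standard care", "mortality", ("ckd", "diabetes")),
]

matches = query_trials(corpus, comparator="placebo", subgroup="diabetes")
print([record.trial_id for record in matches])
```

Because the filter compares stored fields rather than generating text, a question such as "which trials used a placebo comparator and reported a diabetes subgroup?" has one reproducible answer for a given corpus.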

What’s this all about?

What

Doctors and patients make decisions using medical AI claims that rest on broad summaries of clinical trials.

See the Medical AI landscape your doctor uses →

Why

Broad medical AI summaries can overgeneralize and overlook the granular details of clinical-trial findings that matter for your personal context.

Read about the Evidence-to-Person Fit problem →

How

We use AI elicitation to compose your personal context, then run deterministic queries that cross-check clinical-trial claims against it.

See the after-visit summary audit landscape →
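A minimal sketch of what a deterministic cross-check against personal context could look like: a trial's eligibility criteria are compared with a patient's details, and every mismatch is reported as a gap. The `Eligibility` and `PatientContext` types, the `applicability_gaps` function, and the sample values are assumptions made for illustration, not a description of our actual pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Eligibility:
    """Hypothetical eligibility criteria extracted from one trial."""
    min_age: int
    max_age: int
    excluded_conditions: frozenset[str] = frozenset()


@dataclass(frozen=True)
class PatientContext:
    """Hypothetical personal context gathered before the check runs."""
    age: int
    conditions: frozenset[str] = frozenset()


def applicability_gaps(patient: PatientContext, trial: Eligibility) -> list[str]:
    """List the ways this patient differs from the people the trial enrolled."""
    gaps = []
    if not (trial.min_age <= patient.age <= trial.max_age):
        gaps.append(
            f"age {patient.age} is outside the enrolled range "
            f"{trial.min_age}-{trial.max_age}"
        )
    # Sort so the report order is stable from run to run.
    for condition in sorted(patient.conditions & trial.excluded_conditions):
        gaps.append(f"trial excluded participants with {condition}")
    return gaps


# Made-up values, for illustration only.
criteria = Eligibility(18, 65, frozenset({"chronic kidney disease"}))
patient = PatientContext(72, frozenset({"chronic kidney disease", "hypertension"}))

gaps = applicability_gaps(patient, criteria)
for gap in gaps:
    print(gap)
```

An empty list means no mismatch was found on the fields checked. It does not mean the trial's result applies to the patient, which remains a question for the patient and their clinician.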

Industry signals

Why this matters now — two recent peer-reviewed studies on how clinicians and patients are already using medical AI:

Clinician-side: OpenEvidence accounted for 98.7% of searches across leading AI-enabled clinical reference tools, with traffic rising to ~1.59 million visits/month by June 2025.¹

Patient-side: In a study of 617,827 Microsoft Copilot conversations, roughly 1 in 5 involved personal symptom assessment or condition discussion. Microsoft explicitly notes that benchmark performance does not predict real-world reliability for high-stakes health questions.²

References

¹ Patel VR, Liu M, Jena AB. Public Interest in an AI-Enabled Clinical Decision Support Tool. JAMA Network Open, Nov 20, 2025.

² Costa-Gomes B, Tolmachev P, et al. (Microsoft AI). Public use of a generalist LLM chatbot for health queries. Nature Health, April 16, 2026.

Does this service exist anywhere else?

Not directly. Adjacent services handle pieces of the workflow — clinician-side evidence Q&A, patient-side AI triage, trial matching for enrollment — but none combine patient-supplied health data, a structured clinical-trial corpus, and a personalized applicability audit.

Service	What they do	What they’re missing
OpenEvidence, UpToDate AI, AMBOSS AI, ReachRx	Clinician-side evidence Q&A	No patient-data ingestion, no applicability layer
Glass Health	Clinician CDS with FHIR context	Clinician-only — not patient-side audit
Hippocratic AI, Ada, K Health	Patient-facing AI agents (triage, intake, care management)	No structured trial backend
Deep 6 AI, TrialFit, TrialMatchAI	Trial matching for enrollment	Opposite direction — get into trials, not audit existing care
ChatGPT, Claude, Perplexity, Consensus, Elicit	General AI medical Q&A / paper search	No structured trial extraction, no patient-data ingestion
Cleveland Clinic Express Care, Mayo Clinic 2nd Opinion	Human clinician second opinions	Human-mediated, expensive, not data-driven

Closest in shape: Glass Health (clinician-side). Closest in audience: Hippocratic AI (B2B health-system patient agents). What’s empty: the patient-side evidence-audit lane.

Team

Lisa DeMeyere

MSN, ACNP-BC

Transplant nurse practitioner. Keeps us simple and clear.

Jessica Johnson

Microbiology (MS), Science Journalism (MS)

Science writer and breast cancer survivor. Keeps us creative and grounded.

Boris Dev

Building the system. Available for consulting.

No B.S. Med retrieves and structures clinical-study evidence. It does not diagnose, prescribe, or replace professional medical judgment. Users should consult a qualified healthcare professional before making medical decisions.

Request MCP access
