How reaso.ai works

Every claim is run through an editorial pipeline that restates the literal claim, identifies where scientific consensus stands, enriches findings with attribution and study type, and writes a neutral story-voice analysis. No verdict labels. No "honest bottom line" prose. The structure of the disagreement carries the information.

The pipeline

Step 1 — Restate the claim. The classifier first restates the claim's literal proposition verbatim, paying attention to polarity. A claim like "X is a waste of money" stays a negative assertion; a claim like "X causes Y" stays a positive one. No softening to a more defensible form.

Step 2 — Classify consensus. A polarity-neutral classifier runs three times in parallel and takes the majority vote, assigning one of five states: consensus aligns, consensus contradicts, genuinely mixed, emerging, or abandoned by proponents. The label describes the relationship to the claim text, not to the topic in general.
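
As a rough illustration of the voting step, the sketch below tallies three classifier outputs and returns the state that a strict majority agrees on. The names ConsensusState and majority_vote are hypothetical, and returning None when all three runs disagree is an assumption made for this example: the description above does not say how a three-way split is resolved.

```python
from collections import Counter
from enum import Enum
from typing import Optional, Sequence


class ConsensusState(Enum):
    """The five states named above (identifier names are illustrative)."""
    ALIGNS = "consensus aligns"
    CONTRADICTS = "consensus contradicts"
    MIXED = "genuinely mixed"
    EMERGING = "emerging"
    ABANDONED = "abandoned by proponents"


def majority_vote(votes: Sequence[ConsensusState]) -> Optional[ConsensusState]:
    """Return the state chosen by a strict majority of runs, else None.

    Assumption: with no majority (e.g. three different labels) the caller
    gets None and must decide what to do; the source does not specify this.
    """
    if not votes:
        return None
    state, count = Counter(votes).most_common(1)[0]
    # Strict majority: more than half of the runs must agree.
    return state if count * 2 > len(votes) else None


if __name__ == "__main__":
    runs = [ConsensusState.ALIGNS, ConsensusState.MIXED, ConsensusState.ALIGNS]
    print(majority_vote(runs))
```

With two of three runs agreeing, the agreed state wins; with three distinct labels this sketch deliberately declines to pick one.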

Step 3 — Enrich findings. Each evidence item is enriched with publication year (parsed from URLs or fetched from PubMed/Crossref where possible), attribution (lead author, institution, journal, method), story role (originating, replication, critique, extension), the memorable specific worth preserving, and source provenance. Conservative throughout — null is preferred over invention.
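
One way to picture "null is preferred over invention" is a year parser that gives up whenever the URL is ambiguous. This is a minimal sketch, not the actual enrichment code: the EnrichedFinding fields, the year_from_url helper, and the rule "accept a year only when exactly one plausible year appears as its own path segment" are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# A four-digit year standing alone as a path segment, e.g. ".../2019/05/...".
_YEAR_SEGMENT = re.compile(r"/((?:19|20)\d{2})(?=/|$)")


@dataclass
class EnrichedFinding:
    """Hypothetical record shape; every enrichment field defaults to None."""
    url: str
    year: Optional[int] = None
    lead_author: Optional[str] = None
    journal: Optional[str] = None
    story_role: Optional[str] = None  # originating, replication, critique, extension


def year_from_url(url: str, latest_year: int) -> Optional[int]:
    """Return a publication year only when the URL is unambiguous.

    Assumed rule: exactly one distinct year, no later than latest_year,
    must appear as its own path segment. Anything else yields None.
    """
    candidates = {int(y) for y in _YEAR_SEGMENT.findall(url)}
    plausible = {y for y in candidates if y <= latest_year}
    if len(plausible) == 1:
        return plausible.pop()
    return None


if __name__ == "__main__":
    url = "https://example.org/articles/2019/05/some-study"
    print(EnrichedFinding(url=url, year=year_from_url(url, latest_year=2025)))
```

A URL with two different year segments, or with digits that merely resemble a year inside a longer ID, returns None here rather than a guess.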

Step 4 — Write the narrative. A neutral third-person narrator tells the story of how the field came to its current understanding: setup, finding, contestation, mechanism, counter-evidence, where it sits now. The narrative ends openly. It does not synthesize a final position. Banned phrases include "the bottom line," "the honest truth," "in conclusion," "ultimately."
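
A banned-phrase rule like this can be checked mechanically after the narrative is written. The sketch below assumes a simple case-insensitive, whole-phrase match; find_banned_phrases is a hypothetical helper, and the tuple holds only the four phrases quoted above (the text says the list "includes" them, so the real list may be longer).

```python
import re
from typing import List

# Only the phrases quoted in the text; the full list is not given there.
BANNED_PHRASES = (
    "the bottom line",
    "the honest truth",
    "in conclusion",
    "ultimately",
)


def find_banned_phrases(narrative: str) -> List[str]:
    """Return the banned phrases that occur in the narrative, in list order."""
    found = []
    for phrase in BANNED_PHRASES:
        # Word boundaries stop a phrase from matching inside a longer word.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, narrative, flags=re.IGNORECASE):
            found.append(phrase)
    return found


if __name__ == "__main__":
    draft = "Ultimately, the replication effort narrowed the original effect."
    print(find_banned_phrases(draft))
```

An empty result means none of the listed phrases appear; it says nothing about whether the narrative still synthesizes a final position in other words.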

The five consensus states

Consensus aligns with claim — broad scientific consensus, with empirical basis, supports what the claim literally asserts. Challengers exist but represent minority positions.

Consensus contradicts claim — broad scientific consensus, with empirical basis, contradicts what the claim literally asserts. Defenders of the claim exist but represent minority positions.

Mixed evidence — active scientific debate where two or more camps each produce real new evidence on the same operationalised question.

Emerging research — the finding is new, not yet broadly replicated or contested. Consensus has not had time to form.

Abandoned by proponents — the original testable form of the claim has been refuted, and its proponents have pivoted to reframings, motivational arguments, or different outcomes rather than producing new evidence on the original testable form.

No verdict pill. No traffic-light coloring. The structure of where the field actually stands is what matters.

Why some verdicts stay "mixed"

LLMs produce probability distributions over answers. On a settled claim, such as whether smoking causes cancer, the distribution collapses hard to a single verdict. On a contested claim, the distribution is genuinely split, because the evidence is genuinely split.

The classifier runs three times and takes the majority vote to suppress sampling noise, but it won't force a verdict on a claim where the research legitimately doesn't settle. The five-state vocabulary exists so honest uncertainty has somewhere to land.
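
The arithmetic behind this can be sketched with a simplified model. Suppose each run independently returns a given state with probability p; the chance that this state wins a majority of three runs is then 3p^2(1-p) + p^3. Treating runs as independent draws with a fixed p is an assumption for illustration, and the values of p used below are made up, not measured from the pipeline.

```python
from math import comb


def majority_probability(p: float, runs: int = 3) -> float:
    """Probability that a state with per-run probability p wins a strict majority.

    Assumes independent runs with the same p (a simplification).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    needed = runs // 2 + 1  # strict majority, e.g. 2 of 3
    return sum(
        comb(runs, k) * p**k * (1 - p) ** (runs - k)
        for k in range(needed, runs + 1)
    )


if __name__ == "__main__":
    for p in (0.9, 0.5, 0.4):
        print(p, round(majority_probability(p), 3))
```

Under this model a state favored at p = 0.9 wins the vote about 97.2% of the time, so voting sharpens an already lopsided distribution. At p = 0.5 the vote is still a coin flip, and at p = 0.4 the state wins only about 35.2% of the time, so voting does not manufacture agreement where a single run has none.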

A classifier that always picked "aligns" or "contradicts" would be confidently wrong on exactly the cases where reality is contested. We'd rather tell you the evidence is mixed than pretend it isn't.

The architecture around the classifier is designed to remove sources of variance that have nothing to do with evidence: unified source counts across surfaces, a durable content store that deploys can't accidentally wipe, consistent schemas across CLI-seeded and live-analyzed claims, and a prompt that explicitly decomposes compound claims into sub-components before classifying. When a verdict moves, it should be because the evidence moved, not because the plumbing jittered.

AI models used

Claude Sonnet (Anthropic) — runs every step: claim restatement, consensus classification (3-way majority vote), finding enrichment, and narrative writing.

Serper API (Google Search) — finds evidence from peer-reviewed journals, institutional reports, and news sources during the original analysis pipeline.

PubMed Central + Crossref APIs — used during enrichment to backfill publication years from PMC IDs and DOIs.
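
For the DOI case, the Crossref REST API returns work metadata from https://api.crossref.org/works/{doi}, with dates stored as nested "date-parts" lists inside the response's "message" object. The sketch below only parses an already-fetched message and keeps the same null-over-invention stance. The helper name year_from_crossref and the order in which date fields are tried are assumptions, not a description of reaso.ai's code, and the sample dictionaries are illustrative rather than real records.

```python
from typing import Any, Mapping, Optional

# Date fields found on Crossref work records; the lookup order is an assumption.
_DATE_FIELDS = ("issued", "published-print", "published-online")


def year_from_crossref(message: Mapping[str, Any]) -> Optional[int]:
    """Extract a publication year from a Crossref work 'message', else None."""
    for field in _DATE_FIELDS:
        try:
            year = message[field]["date-parts"][0][0]
        except (KeyError, IndexError, TypeError):
            continue  # field missing or malformed: try the next one
        # Crossref can report an unknown date as [[None]]; skip those.
        if isinstance(year, int) and not isinstance(year, bool):
            return year
    return None


if __name__ == "__main__":
    sample = {"issued": {"date-parts": [[2020, 5, 1]]}}  # illustrative data
    print(year_from_crossref(sample))
```

If no field carries a usable integer year, the function returns None so the finding renders without a year.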

Who built this

reaso.ai was created by Thiago Alvarez — a software engineer building AI tools that help people think more clearly about evidence.

The project is independent and self-funded. Not affiliated with any health organization, supplement company, or political group. The only agenda is making the structure of scientific disagreement legible.

Limitations

No individual human reviewer signs off on each analysis. The pipeline produces the analysis; no specialist reads it before publication. We're working on adding named reviewers per claim. Until then, treat every analysis as a starting point for your own reading, not as a substitute for expert judgment on consequential decisions.

AI can misinterpret study findings, miss relevant research, or surface lower-quality sources alongside higher-quality ones. Evidence searches are limited to publicly accessible sources — paywalled studies may be missed.

The majority-vote classifier reduces sampling noise, but a genuinely borderline claim can still land on different states across independent runs. That variance usually reflects honest uncertainty in the evidence itself rather than a flaw in the pipeline (see "Why some verdicts stay mixed"). Year and attribution are extracted conservatively: many findings render without a year because we'd rather show no year than invent one.

Analyses reflect evidence available at the time of generation and may shift as new research emerges. This tool reports the state of the literature — it does not provide medical, nutritional, legal, or professional advice.

Contact

Found an error? Have feedback? Reach out at hello@reaso.ai.