Discernment: reading and evaluating AI output

Check AI-generated work for accuracy by breaking claims into pieces you can verify one at a time: numbers you can recalculate, facts you can look up, and citations you can find in the actual source.

Time: 20–25 min
Type: exercise
Bloom: Apply → Create
XP: 100

Figure: concept architecture for Lesson 2.7, Discernment: reading and evaluating AI output

You'll be able to

Check AI-generated work for accuracy by breaking claims into pieces you can verify one at a time: numbers you can recalculate, facts you can look up, and citations you can find in the actual source.
Recognize the common ways AI output goes wrong: invented citations that sound real, patterns that fit the data but miss the real cause, and claims that aren't supported by the source even when the source exists.
Read an output through three practical lenses: is the content accurate, is your back-and-forth with the AI actually improving it, and will it behave correctly when real people use it.
Judge whether your back-and-forth with the AI is actually improving the output, and change your approach when it isn't. Sometimes the problem is your prompt; sometimes it's the task you chose.
Build a personal checklist you can run before you release AI-assisted work, so nothing goes out with your name on it until you have checked it.
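
A personal pre-release checklist like the one in the last objective can be sketched as a small data structure. This is only an illustration: the class names, the lens labels, and the sample questions are hypothetical, chosen to mirror the three lenses above, and you would replace them with your own checks.

```python
from dataclasses import dataclass, field


@dataclass
class Check:
    lens: str        # hypothetical labels: "content", "collaboration", "real-use"
    question: str
    done: bool = False


@dataclass
class ReleaseChecklist:
    checks: list[Check] = field(default_factory=list)

    def pending(self) -> list[Check]:
        """Return every check that has not been completed yet."""
        return [c for c in self.checks if not c.done]

    def ready_to_release(self) -> bool:
        """True only when there is at least one check and none are pending."""
        return bool(self.checks) and not self.pending()


# Sample questions are placeholders, not an official list.
checklist = ReleaseChecklist([
    Check("content", "Have I recalculated every number?"),
    Check("content", "Have I found every citation in the actual source?"),
    Check("collaboration", "Did the last round of prompting improve the output?"),
    Check("real-use", "Will this behave correctly when real people use it?"),
])

for item in checklist.pending():
    print(f"[{item.lens}] {item.question}")
print("Ready to release:", checklist.ready_to_release())
```

The design choice worth noting is that an empty checklist never passes, so skipping the checklist entirely cannot count as having checked.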



Hook

The AI gave you three confident paragraphs with dollar figures and a footnote. Two days later, you find out that Table 3A doesn't exist and the numbers are wrong by millions.

Prompt Lab

Your task: write a prompt that asks Claude to recommend the right AI setup for a real task you're facing, then weigh its answer against this lesson, "Discernment: reading and evaluating AI output."

A strong prompt: role · context · task · format · example


[Diagram: flowchart of the AI output evaluation process]

Exercise · scenario

A pharmaceutical regulatory affairs specialist receives an AI-generated summary of adverse event reports from a clinical trial. The summary states: 'Headache was reported in 23% of treatment group participants versus 18% in placebo (p=0.04), representing a statistically significant difference. This 5-percentage-point gap indicates the drug causes severe neurological side effects requiring immediate trial suspension.' The specialist notices the AI correctly cited the percentages from the source data but questions the interpretation and recommendation.

Deliverable

You will produce a **Discernment Audit Report** as a Markdown document. Select one AI-generated technical artifact from your recent work (a code snippet, a configuration file, a troubleshooting recommendation, or a model-selection justification). Paste the AI output into your report, then evaluate it through the three lenses: the content itself (accuracy, completeness, relevance), the collaboration (whether the prompt-response cycle was efficient or required excessive iteration), and the real-use behavior (whether the output would behave correctly in production or user-facing scenarios).

Model answer

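Statistically accurate but interpretively flawed. The percentages and p-value match the source data, but a statistically significant 5-percentage-point difference in headache reports says nothing about severity, and it does not by itself support a claim of "severe neurological side effects" or a recommendation to suspend the trial immediately.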
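
The numbers in the exercise scenario can be rechecked separately from the conclusion drawn from them. The sketch below recalculates the gap from the two quoted percentages and then flags wording that goes beyond what the figures show. The 0.05 threshold and the list of flagged terms are assumptions made for illustration; keyword matching is a crude stand-in for actually reading the interpretation critically.

```python
# Figures quoted in the scenario's AI-generated summary.
treatment_pct = 23       # headache rate in the treatment group (percent)
placebo_pct = 18         # headache rate in the placebo group (percent)
claimed_gap_points = 5   # gap stated in the summary (percentage points)
claimed_p_value = 0.04

# Assumption: conventional significance threshold, not stated in the scenario.
ALPHA = 0.05

# Assumption: hand-picked words that signal a leap beyond the numbers.
LEAP_TERMS = ("causes", "severe", "immediate")

interpretation = (
    "This 5-percentage-point gap indicates the drug causes severe "
    "neurological side effects requiring immediate trial suspension."
)


def audit_summary() -> dict:
    """Recalculate the derived number and list wording that needs scrutiny."""
    gap = treatment_pct - placebo_pct
    return {
        "gap_recalculated": gap,
        "gap_matches_claim": gap == claimed_gap_points,
        "significant_at_alpha": claimed_p_value < ALPHA,
        "leap_terms_found": [t for t in LEAP_TERMS if t in interpretation.lower()],
    }


for key, value in audit_summary().items():
    print(f"{key}: {value}")
```

The arithmetic checks pass while the wording check still raises flags, which is the pattern the scenario is built around: correct numbers do not validate the recommendation attached to them.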

Practice · Scenarios


Scenario 1 of 8

A nonprofit grant writer uses an AI tool to summarize foundation funding priorities. The output states: 'The Morrison Foundation prioritizes climate resilience projects in coastal communities, with 78% of 2023 grants awarded to organizations in this category (Morrison Foundation Annual Report, p. 34). Your watershed restoration proposal aligns perfectly with their mission.' When the grant writer checks page 34 of the actual report, she finds a table showing various grant categories but no 78% figure for coastal climate projects.

Step 1 · Classify

Accurate statistic but a mistranscribed page reference
Real figure synthesized from the table's grant categories
Contains a fabricated statistic despite real citation
Correct interpretation drawn from an outdated report edition
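
The check the grant writer performs in this scenario, looking for the quoted figure on the cited page, can be sketched in a few lines. The page text below is a made-up placeholder, not the content of any real report, and the function name is hypothetical. A real check would also need to consider whether the figure could be derived from the table rather than appearing verbatim.

```python
import re


def figure_on_page(figure: str, page_text: str) -> bool:
    """Return True if the exact figure (e.g. '78%') appears in the page text.

    The lookbehind stops '78%' from matching inside '178%' or '0.78%'.
    """
    pattern = r"(?<![\d.])" + re.escape(figure)
    return re.search(pattern, page_text) is not None


# Placeholder text standing in for the cited page; the values are invented.
page_34_text = (
    "Grants by category, 2023: Education 31%, Health 27%, "
    "Environment 22%, Arts 20%."
)

cited_figure = "78%"
if figure_on_page(cited_figure, page_34_text):
    print(f"{cited_figure} appears on the cited page; now check its context.")
else:
    print(f"{cited_figure} does not appear on the cited page; treat it as unverified.")
```

Finding the figure is only the first step. A figure that does appear still has to mean what the AI output says it means.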

Common misconceptions

“If an AI output reads fluently and cites sources, it is probably accurate and can be trusted without verification”
Fluency and citation format are not reliable indicators of correctness. Current large language models can fabricate metrics, invent citations, and miscalculate derived quantities even when the output appears authoritative. Discernment means decomposing the output into atomic claims and verifying each claim against authoritative sources, regardless of how confident or well-formatted the response appears.
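
Decomposing an output into atomic claims can be pictured as splitting it into sentences and tagging each one with the kind of check it needs. The sketch below is a deliberately naive illustration: the sample text is invented, the `Claim` class and the three categories are hypothetical, and the sentence splitter will break on abbreviations such as "p. 34".

```python
import re
from dataclasses import dataclass

# Assumption: a citation is anything that points at a numbered table or figure.
CITATION_PATTERN = re.compile(r"\b(?:Table|Figure)\s+\w+")


@dataclass
class Claim:
    text: str
    kind: str           # "citation", "number", or "fact"
    how_to_check: str


def split_claims(output: str) -> list[Claim]:
    """Split text into sentences and tag each with a verification method."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", output) if s.strip()]
    claims = []
    for sentence in sentences:
        if CITATION_PATTERN.search(sentence):
            claims.append(Claim(sentence, "citation", "find it in the actual source"))
        elif re.search(r"\d", sentence):
            claims.append(Claim(sentence, "number", "recalculate it"))
        else:
            claims.append(Claim(sentence, "fact", "look it up"))
    return claims


# Invented sample output, used only to exercise the function.
sample_output = (
    "Revenue rose 12% in 2023. See Table 3A of the annual report. "
    "The growth reflects strong demand."
)

for claim in split_claims(sample_output):
    print(f"{claim.kind:9} | {claim.how_to_check:30} | {claim.text}")
```

Each row that comes out is one thing to verify on its own, so a confident tone in the surrounding prose has no bearing on whether any single claim passes.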


