Deep Research Playbooks

Use this three-phase workflow to produce high-trust research that survives scrutiny.

Why most AI research outputs fail

AI tools can produce a 40-page research brief in under 10 minutes. That is not the problem. The problem is that the brief usually mixes strong sources with weak ones, confident claims with extrapolations, and current data with training-cutoff data — all formatted with identical authority. When a stakeholder asks "how do we know this is true?" the answer is often "the model said so."

Deep research playbooks fix this by forcing the work into three separate phases: discovery, synthesis, and verification. Each phase produces its own artifact with its own quality bar. You never let the model jump from "find sources" to "deliver recommendation" in one shot, because that is the exact path where hallucinations and overreach slip through.

The three-phase model is boring on purpose. Boring processes survive scrutiny. Interesting processes produce outputs that look great in a deck and fall apart in a board meeting.

Discovery pass

Map the landscape fast without pretending certainty.

Task: [clear question]
Scope: [time window + geography + industry]
Output: map of 10 to 15 sources with confidence tags and contradictions.

Synthesis pass

Convert scattered sources into a decision narrative.

Input: ranked sources from discovery
Output: thesis, counter-thesis, unknowns, and actionable options.

Verification pass

Stress-test claims before publication or decision.

For each key claim: source, evidence strength, failure mode, check owner, due date.

How to run each phase

Phase 1: Discovery

The goal is breadth, not certainty. You are mapping the landscape to find sources, perspectives, and contradictions worth investigating. Let the model be a research librarian — not a thesis writer. Ask for 10 to 15 sources with explicit confidence tags (primary, secondary, speculative) and call out where sources disagree. If every source agrees, treat that as a signal that the search was too narrow.
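If you want the discovery output in a machine-checkable form rather than free prose, a minimal sketch might look like the following. The field names and the structure are illustrative assumptions, not a fixed schema; only the three confidence tags come from the template above.

```python
from dataclasses import dataclass, field

# Confidence tags named in the discovery template: primary, secondary, speculative.
CONFIDENCE_TAGS = {"primary", "secondary", "speculative"}

@dataclass
class SourceEntry:
    """One row in the discovery-phase source map (hypothetical structure)."""
    title: str
    url: str
    published: str   # date stamp, to keep live data separate from training-cutoff knowledge
    confidence: str  # one of CONFIDENCE_TAGS
    contradicts: list[str] = field(default_factory=list)  # sources this one disagrees with

    def __post_init__(self) -> None:
        if self.confidence not in CONFIDENCE_TAGS:
            raise ValueError(f"unknown confidence tag: {self.confidence}")

def search_too_narrow(source_map: list[SourceEntry]) -> bool:
    """If no source contradicts any other, treat the discovery pass as too narrow."""
    return all(not entry.contradicts for entry in source_map)
```

A structure like this makes the "every source agrees" warning sign something you can check mechanically instead of eyeballing.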

Phase 2: Synthesis

This is where you convert the sources into a decision. Force the model to write a thesis and a counter-thesis, then list unknowns and trade-offs. If the synthesis only produces one perspective, it has not done the job. Decision-grade output includes what would need to be true for you to change your mind.
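One way to force that shape is to demand a fixed output structure rather than free prose. A minimal sketch, with illustrative field names that are assumptions rather than a standard:

```python
from dataclasses import dataclass

@dataclass
class SynthesisOutput:
    """Decision-grade synthesis skeleton (illustrative field names, not a standard)."""
    thesis: str
    counter_thesis: str
    unknowns: list[str]
    options: list[str]            # actionable options, with trade-offs noted
    change_my_mind_if: list[str]  # what would need to be true to reverse the recommendation

    def is_decision_grade(self) -> bool:
        # A synthesis with no counter-thesis or no reversal conditions has not done the job.
        return bool(self.counter_thesis.strip()) and bool(self.change_my_mind_if)
```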

Phase 3: Verification

Every high-impact claim gets a named human owner, a source link, and a due date for the final check. This is the phase that kills hallucinations, because attaching a human name to a claim means someone is accountable for checking it against a real source before it ships; vague or unsupported claims stop surviving the pass.

Operating rule

Never publish a synthesis output unless every high-impact claim has a verification owner and timestamp.
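The rule is mechanical enough to enforce with a small gate. A minimal sketch, assuming a claim record with the fields from the verification template (owner, source, due date) plus a timestamp set when the check is done; none of this is a prescribed schema, and the data below is purely illustrative.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Claim:
    """One high-impact claim awaiting verification (hypothetical structure)."""
    text: str
    source_url: str
    owner: str                              # named human responsible for the check
    due: datetime                           # deadline for the final check
    verified_at: Optional[datetime] = None  # set when the owner signs off

def ready_to_publish(claims: list[Claim]) -> bool:
    """Operating rule: every high-impact claim needs a named owner and a verification timestamp."""
    return bool(claims) and all(c.owner.strip() and c.verified_at is not None for c in claims)

# Illustrative data only: one verified claim, one still waiting on its owner.
claims = [
    Claim("Segment grew faster than the overall market", "https://example.com/report",
          "A. Rivera", datetime(2025, 7, 1), verified_at=datetime(2025, 6, 28)),
    Claim("Competitor X is exiting the segment", "https://example.com/news",
          "J. Chen", datetime(2025, 7, 1)),
]
assert not ready_to_publish(claims)  # blocked: the second claim has no verification timestamp yet
```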

Common failure modes

  • Collapsing phases into one prompt: asking the model to "research and recommend" produces confident-sounding recommendations backed by thin sources.
  • No named owner for verification: "the AI checked it" is not verification. A human name on each claim forces accountability.
  • Confusing training data with live data: models answer confidently from stale knowledge. Ask for date-stamped sources or force a live-search tool into the loop.
  • No counter-thesis: if the synthesis presents one narrative, the model has pattern-matched, not reasoned.

FAQ

Which model should I use for deep research?

Use a model with live search or agent-browser capability for discovery, and a strong reasoning model for synthesis. Deep Research modes in ChatGPT, Claude, Gemini, and Perplexity are designed for this workflow. The choice matters less than enforcing the three-phase separation.

How long should each phase take?

Discovery is usually 15 to 30 minutes. Synthesis takes 30 to 60 minutes if you're pushing for a decision-grade narrative. Verification depends on claim count — budget 5 minutes per high-impact claim. Rushed verification is the single most common quality failure.

Can one person run all three phases?

Yes, but separate the sessions. Different prompt context, different output format, ideally different calendar blocks. Running all three in one continuous session makes it easy to skip verification because the synthesis already "feels" done.