Question 1

Does DocketDrift hallucinate cases?

Accepted Answer

No. Generative AI legal tools built on large language models (Lexis+ AI, Westlaw AI-Assisted Research, CoCounsel, Harvey, and others) have been documented to fabricate case citations at rates of 17 to 33 percent, even with retrieval augmentation. That failure mode is structurally impossible on DocketDrift because the system cannot generate any new text. Every record traces to an opinion that the public can verify against the official source.

Question 2

Where does machine learning appear in DocketDrift?

Accepted Answer

Two places, both narrow. (1) Voyage embeddings for semantic search: each opinion is represented as a 1024-dimensional vector. The system compares a query vector to the opinion vectors using cosine similarity and returns an ordered list of opinion IDs. No text is generated. (2) Tag-suggestion candidates: embeddings rank candidate tags by their similarity to each opinion. Above a high-confidence threshold, the tag is applied automatically and marked AUTO_APPLIED so it can be audited. Below that threshold, the suggestion goes to a human-review queue, where an editor accepts or rejects it. No low-confidence suggestion becomes a published tag without that review.
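The two uses above can be sketched in a few lines. This is a minimal illustration, not DocketDrift's actual code: the function names, the NEEDS_REVIEW label, and the 0.90 threshold are assumptions (the answer only says "high-confidence"), and toy 3-dimensional vectors stand in for the 1024-dimensional embeddings.

```python
import math
from typing import Dict, List

# Assumed value for illustration; the real threshold is not stated.
AUTO_APPLY_THRESHOLD = 0.90


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_opinions(query_vec: List[float],
                  opinion_vecs: Dict[str, List[float]],
                  top_k: int = 10) -> List[str]:
    """Return opinion IDs ordered by similarity to the query. IDs only, no text."""
    scored = sorted(opinion_vecs.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [opinion_id for opinion_id, _ in scored[:top_k]]


def route_tag(similarity: float) -> str:
    """Auto-apply confident tags; send everything else to the human-review queue."""
    # Treating a score exactly at the threshold as confident is an assumption.
    if similarity >= AUTO_APPLY_THRESHOLD:
        return "AUTO_APPLIED"
    return "NEEDS_REVIEW"  # hypothetical label for the review queue


# Toy data: hypothetical opinion IDs and vectors.
opinions = {
    "op-1": [1.0, 0.0, 0.0],
    "op-2": [0.0, 1.0, 0.0],
    "op-3": [0.9, 0.1, 0.0],
}
ranked = rank_opinions([1.0, 0.0, 0.0], opinions)
```

Both functions return only identifiers or a fixed status label, which is the point of the design: the embedding model influences ordering and routing, but it never contributes words to the output.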

Question 3

How is everything else (case number, disposition, panel) extracted?

Accepted Answer

Deterministic regex over the actual published text. Either the pattern matches or it doesn't. There is no LLM in any of those paths, and no LLM is ever asked to synthesize, summarize, or describe what it sees.
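A minimal sketch of what "either the pattern matches or it doesn't" can look like in practice. The three patterns and the sample text below are invented for illustration; DocketDrift's real patterns are not published here and are surely more thorough.

```python
import re
from typing import Dict, Optional

# Hypothetical patterns, written only to illustrate the approach.
CASE_NUMBER = re.compile(r"\bNo\.\s*(\d{2}-\d{3,5})\b")
DISPOSITION = re.compile(r"\b(AFFIRMED|REVERSED|VACATED|REMANDED|DISMISSED)\b")
PANEL = re.compile(r"^Before\s+(.+?),\s+Circuit Judges\.", re.MULTILINE)


def extract_fields(opinion_text: str) -> Dict[str, Optional[str]]:
    """Return each field copied verbatim from the text, or None when no pattern matches."""
    def first_match(pattern: "re.Pattern[str]") -> Optional[str]:
        match = pattern.search(opinion_text)
        return match.group(1) if match else None

    return {
        "case_number": first_match(CASE_NUMBER),
        "disposition": first_match(DISPOSITION),
        "panel": first_match(PANEL),
    }


# Made-up sample text, not a real opinion.
sample = (
    "No. 21-1234\n"
    "Before SMITH, JONES, and LEE, Circuit Judges.\n"
    "The judgment of the district court is AFFIRMED.\n"
)
fields = extract_fields(sample)
empty = extract_fields("No recognizable fields here.")
```

A field is either a substring of the published text or None. There is no intermediate state in which the extractor guesses a plausible value.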

Question 4

Why is hallucination architecturally impossible here?

Accepted Answer

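The system cannot produce a fake case citation because it cannot generate any new text. Search returns opinion IDs, field extraction is pattern matching over the published text, and no language model writes any output. Every record traces to an opinion that the public can verify against the official source.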

What we don’t do	What we do
Generate legal analysis	Index real, published opinions
Draft briefs, memos, contracts	Link to the source URL of every record
Answer “what’s the holding in…?”	Show you the actual opinion text
Summarize cases into prose	Pull verbatim surrounding text around statute citations
Predict outcomes or judge behavior	Count actual prior outcomes (counts only, no narrative)
Synthesize holdings	Color-code the disposition that’s literally printed in the opinion
Chat / answer questions	Provide a tag-suggestion queue a human editor must accept or reject
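The "verbatim surrounding text" row can be illustrated with a short sketch. It assumes a simplified U.S.C. citation pattern and an arbitrary character window; both are hypothetical choices, and the function name is invented for this example.

```python
import re
from typing import List, Tuple

# Simplified, hypothetical pattern for citations such as "42 U.S.C. § 1983".
STATUTE = re.compile(r"\b\d+\s+U\.S\.C\.\s+§{1,2}\s*\d+[A-Za-z]?")


def statute_contexts(opinion_text: str, window: int = 80) -> List[Tuple[str, str]]:
    """Return (citation, snippet) pairs; each snippet is a verbatim slice of the text."""
    results = []
    for match in STATUTE.finditer(opinion_text):
        start = max(0, match.start() - window)
        end = min(len(opinion_text), match.end() + window)
        results.append((match.group(0), opinion_text[start:end]))
    return results


# Made-up sentence, not taken from a real opinion.
text = "The claim arises under 42 U.S.C. § 1983 and was timely filed."
contexts = statute_contexts(text, window=10)
```

Because each snippet is a slice of the original string, it cannot contain words that are absent from the opinion.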
