Provisional Patent Draft Accuracy: Measured Rework Rates vs Human Baseline
Abstract
LLM-assisted provisional patent drafting promises faster cycle time at the inventor-startup phase, but the question for operators is how much of the LLM draft survives attorney review. We ran a paired-draft protocol on a single provisional patent application — Quantum Caddy's smart-board scoring system — comparing an LLM-generated draft (Cipher agent, Claude Code) against a senior IP-attorney baseline. Rework rate, measured as the percent of words materially edited or replaced before attorney sign-off, was 31% on independent claims, 18% on the abstract, 47% on the prior-art comparison section, and 22% on figures-and-drawings descriptions. Time-to-final-draft favored the LLM-assisted path by roughly 3.4× on background and embodiment sections, and by 1.6× on claims (where attorney review absorbed most of the savings). The failure modes the LLM introduced are non-random and cluster into six recurring categories. Because n = 1 application, this paper is positioned as a case-study contribution rather than a population estimate, and we are explicit about which numbers are measured versus illustrative.
1. Introduction
Provisional patent applications are the natural first IP artifact for an early-stage hardware startup. They establish a priority date, are inexpensive relative to non-provisionals, and do not require formal claims to be filed — but in practice, savvy filers include claim-style language anyway, because the disclosure that supports the eventual non-provisional must already exist on the priority date [1][2]. The drafting load on a founder-engineer is non-trivial: 10-30 pages of structured technical writing across abstract, background, embodiments, drawings, prior-art comparison, and at least an outline of independent and dependent claims [3].
Recent empirical work has begun to characterize how well large language models perform on legal-drafting tasks of this shape. GPT-4 passes the bar exam at the 90th percentile [6]; controlled studies in legal analysis show meaningful AI-assisted productivity gains with attorney supervision [5]; an early empirical pilot of LLM use in patent prosecution suggests the technology is plausibly useful but produces characteristic error classes [4]. What this literature does not yet supply, for the operator, is a numbers-on-the-page answer to a simple question: how much of the LLM’s draft survives senior attorney review?
We ran a paired-draft protocol on a single provisional application — the Quantum Caddy smart-board scoring system, an AR-mediated cornhole tracker with a sensor-fusion claim stack [9] — and measured rework at the word level across six section types. The contribution is a case-study data point with a transparent methodology that other operators can replicate. We are explicit throughout about what is measured (rework on this draft, by this attorney, on this invention) and what is illustrative (the attorney-only baseline, drawn from quoted historical averages rather than a controlled re-draft).
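To make the metric concrete, the word-level rework rate described above can be sketched as a diff between the LLM draft and the attorney-approved final text. This is a minimal illustration, assuming whitespace tokenization and Python's `difflib.SequenceMatcher` as the diff engine; the paper's actual measurement tooling is not specified, and `rework_rate` is a hypothetical helper named here for clarity.

```python
import difflib

def rework_rate(llm_draft: str, final_draft: str) -> float:
    """Percent of LLM-draft words edited or replaced before sign-off.

    Words that survive into the final draft as contiguous matching runs
    count as retained; everything else counts as rework.
    """
    a = llm_draft.split()
    b = final_draft.split()
    if not a:
        return 0.0
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    surviving = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * (1 - surviving / len(a))

# Illustrative (invented) sentence pair, not text from the study:
draft = "the sensor fusion module estimates bag pose continuously"
final = "the sensor array measures bag trajectory in real time"
```

A real replication would also need a materiality filter (e.g. ignoring punctuation-only or citation-format changes), since the paper counts only *materially* edited words.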
The rest of TPL-2026-015 is for subscribers.