Finding #61: Regulatory Completeness Analysis

Date: 2026-05-09
Lens: Regulatory completeness analysis (does the document correctly implement referenced regulatory requirements?)
Document: gargoyle wash-sale-tracking.md (159 lines)

Task

Analyze the wash sale tracking design document for regulatory compliance gaps: does the design correctly capture all requirements from IRC §1091, Treasury Regulations §1.1091-1, and IRS Publication 550?

Results

| Model | Output tokens | Reasoning tokens | Findings |
|---|---|---|---|
| GPT-5 | 10,460 | 8,832 | 10 |
| Claude Opus 4.5 | 2,227 | (internal) | 11 |
| Claude Sonnet 4 | 1,276 | (internal) | 10 |

Common Ground (all 3 identified)

  • Cross-account/IRA wash sale detection missing (CRITICAL) — IRC §1091 applies across all taxpayer accounts; IRA losses are permanently disallowed, not deferred
  • Substantially identical definition too narrow — misses ADRs vs ordinary shares, preferred vs common, etc.
  • Trade date vs settlement date ambiguity — IRS uses trade date for 61-day window
  • Multiple replacement lots allocation unclear — no FIFO or ordering rule specified
  • 1099-B reconciliation requirements missing — broker vs platform calculations may differ
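
The trade-date point above can be made concrete with a small check. This is a minimal sketch, not the design document's implementation: the function name `in_wash_window` and the constant `WINDOW_DAYS` are hypothetical, and it assumes both dates are trade dates rather than settlement dates.

```python
from datetime import date

# 30 days before the sale, the sale date itself, and 30 days after = 61 days.
WINDOW_DAYS = 30


def in_wash_window(sale_trade_date: date, purchase_trade_date: date) -> bool:
    """Return True if the purchase trade date falls inside the 61-day window."""
    return abs((purchase_trade_date - sale_trade_date).days) <= WINDOW_DAYS


sale = date(2026, 3, 16)
print(in_wash_window(sale, date(2026, 2, 14)))  # 30 days before -> True
print(in_wash_window(sale, date(2026, 4, 15)))  # 30 days after  -> True
print(in_wash_window(sale, date(2026, 4, 16)))  # 31 days after  -> False
```

Feeding settlement dates into the same function would shift both ends of the window, which is why the ambiguity matters near the boundaries.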

GPT-5 Unique Findings

  • Pairwise detection over-disallowance: Share-level ledger needed; current pairwise model can disallow same loss multiple times across multiple replacement lots
  • Lot-level vs share-level adjustments: Basis and holding period adjustments described at lot level, not share level; partial overlap would incorrectly adjust entire replacement lot
  • Corporate action false positives: Splits/dividends creating "new lots" would trigger false wash sales (these aren't purchases under IRC §1091)
  • Short sale wash sale rules: Window measured differently for shorts (30 days before short sale through 30 days after close); document silent on shorts
  • Pre-sale/post-sale allocation determinism: No deterministic rule for allocating disallowed loss when both pre-sale and post-sale purchases exist
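
The over-disallowance and lot-level adjustment findings can be illustrated with a share-level ledger. The sketch below is an assumption-laden illustration, not the document's design: `ReplacementLot` and `allocate_disallowed` are hypothetical names, and lots are consumed in list order, which is assumed here to be earliest acquisition first.

```python
from dataclasses import dataclass


@dataclass
class ReplacementLot:
    lot_id: str
    shares: int       # shares acquired inside the wash sale window
    matched: int = 0  # shares already paired with a loss share


def allocate_disallowed(loss_shares: int, loss_per_share: float,
                        lots: list[ReplacementLot]) -> dict[str, float]:
    """Pair each loss share with at most one replacement share.

    Lots are consumed in list order (assumed: earliest acquisition first).
    Returns the disallowed loss attributed to each replacement lot.
    """
    remaining = loss_shares
    disallowed: dict[str, float] = {}
    for lot in lots:
        if remaining == 0:
            break
        take = min(remaining, lot.shares - lot.matched)
        if take > 0:
            lot.matched += take
            remaining -= take
            disallowed[lot.lot_id] = take * loss_per_share
    return disallowed


# 100 shares sold at a $5/share loss; two replacement lots inside the window.
lots = [ReplacementLot("B1", 60), ReplacementLot("B2", 100)]
result = allocate_disallowed(loss_shares=100, loss_per_share=5.0, lots=lots)
print(result)  # {'B1': 300.0, 'B2': 200.0}
```

A pairwise model that treats each (sale, replacement lot) pair independently would disallow 60 × $5 + 100 × $5 = $800 against a total loss of $500. The ledger caps the total at $500, and only 40 of lot B2's 100 shares carry a basis and holding period adjustment.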

Opus Unique Findings

  • IRA loss permanently lost: Uniquely emphasized that IRA wash sales don't just defer losses — the loss is permanently unrecoverable (can't add to IRA basis). Document's unconditional basis adjustment would mislead users.
  • Option exercise/assignment as purchases: Treasury Reg §1.1091-1(a) explicitly includes "contract or option" acquisitions; exercising a call or being assigned on a short put is a purchase for wash sale purposes
  • Merger continuity: Company A → Company B reorganization may leave A and B "substantially identical"; stable instrument identifier approach may miss this
  • Chained wash sales: Replacement lot sold at a loss triggering another wash sale with third lot — holding period chains through multiple replacements
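
The IRA finding amounts to a branch that an unconditional basis adjustment lacks. A minimal sketch, assuming a hypothetical account label (`"taxable"` or `"ira"`) on the replacement purchase; the function name is illustrative and does not come from the design document.

```python
def replacement_basis_adjustment(disallowed_loss: float,
                                 replacement_account: str) -> float:
    """Amount to add to the replacement shares' basis for a disallowed loss.

    replacement_account is a hypothetical label: "taxable" or "ira".
    """
    if replacement_account == "ira":
        # The loss is disallowed and cannot be recovered through IRA basis.
        return 0.0
    # Taxable account: the loss is deferred into the replacement shares' basis.
    return disallowed_loss


print(replacement_basis_adjustment(500.0, "taxable"))  # 500.0
print(replacement_basis_adjustment(500.0, "ira"))      # 0.0
```

In both cases the loss itself is disallowed on the sale; only the taxable branch gives the user a later offset, so any user-facing "deferred" wording should be conditional on the account type.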

Sonnet Unique Findings

  • Constructive ownership rules: Treasury Reg §1.1091-1(c) covers related parties (spouse, controlled corps, partnerships) — missing from design
  • Stock rights and dividends: Treasury Reg §1.1091-1(g) addresses how these affect "substantially identical" and basis calculations
  • Dealer exception: Treasury Reg §1.1091-1(b) exempts securities dealers — out of scope but not documented

Key Insights

GPT-5's Exhaustive Regulatory Enumeration

GPT-5's 8,832 reasoning tokens enabled systematic cross-referencing between document claims and IRC/Treasury Reg sections. It explicitly enumerated pairwise allocation edge cases that would cause mathematical over-disallowance. This is the same exhaustive enumeration pattern seen in Finding #58 (state machine completeness) and Finding #59 (convention rule gaps).

Opus Traces Regulatory Gaps to Operational Consequences

Opus uniquely emphasized the permanent nature of IRA wash sale loss disallowance and traced option exercise scenarios through to incorrect tax reporting. Consistent with previous findings, Opus surfaces assumptions the document does not examine: here, the gap between its "deferral" language and the IRA case, where deferral does not apply.

Sonnet Finds Structural Regulatory Categories

Sonnet uniquely identified constructive ownership rules (Treasury Reg §1.1091-1(c)) as a missing category, a structural gap in the regulatory coverage. Unlike the other two models, however, Sonnet did not trace this gap to specific failure modes.

Task Taxonomy Update

Regulatory completeness analysis → GPT-5 for exhaustive IRC/CFR cross-referencing, Opus for tracing gaps to operational consequences, Sonnet for structural category identification

This lens is distinct from other document analysis types:

  • State machine completeness (#58) tests transition coverage
  • Convention rule gaps (#59) tests specification consistency
  • Event ordering (#60) tests temporal failure modes
  • Regulatory completeness tests legal/regulatory implementation correctness

Practical Implication

For regulatory compliance analysis of financial systems, use GPT-5 for exhaustive regulation cross-referencing, then Opus to trace gaps to operational and legal consequences. Sonnet provides an efficient structural overview but insufficient depth for compliance work.

Efficiency

  • GPT-5: 1,046 output tokens/finding (verbose but exhaustive)
  • Opus: 202 output tokens/finding (good detail-to-length ratio)
  • Sonnet: 128 output tokens/finding (efficient but surface-level)
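
The per-finding figures follow from dividing each model's output tokens by its finding count in the Results table. A short sketch of that arithmetic (the variable names are illustrative):

```python
# (output tokens, findings) per model, taken from the Results table.
runs = {
    "GPT-5": (10_460, 10),
    "Claude Opus 4.5": (2_227, 11),
    "Claude Sonnet 4": (1_276, 10),
}

# Output tokens per finding, rounded to the nearest whole token.
tokens_per_finding = {
    model: round(tokens / findings)
    for model, (tokens, findings) in runs.items()
}
print(tokens_per_finding)
# {'GPT-5': 1046, 'Claude Opus 4.5': 202, 'Claude Sonnet 4': 128}
```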