# Prompt: Adversarial Manipulation Analysis

Used in Finding #29.

## Setup

- Single document (full text)
- Same prompt to all models
- No tools, no project context beyond the document

## Prompt

```
You are a red-team security analyst reviewing a trading system's aggregation component. Your task is to identify how a MISBEHAVING, COMPROMISED, or BUGGY upstream component could exploit this design to produce harmful trading outcomes that bypass downstream safety controls.

## Categories of adversarial manipulation:

1. **Signal injection** — How could a compromised strategy inject signals that exploit the aggregator's logic to produce dangerous decisions?
2. **Timing manipulation** — How could an attacker manipulate timing (delays, bursts, clock skew) to exploit the aggregator's temporal logic?
3. **Capacity weaponization** — How could the max_signals bound or group completion logic be exploited to force premature or delayed decisions?
4. **State corruption via crash** — How could deliberate crashes be used to put the aggregator in an exploitable state?
5. **Audit evasion** — How could an attacker cause the aggregator to make decisions that don't appear in the audit log, or appear differently than what actually happened?

## For each attack vector:

- **Category:** (one of the 5 above)
- **Attack vector:** Name of the attack
- **Mechanism:** How the attacker exploits the design
- **Exploit:** Step-by-step attack sequence
- **Why downstream controls miss it:** Why PortfolioRisk, BuyingPower, or other downstream checks don't catch this
- **Severity:** Critical / High / Medium
- **Mitigation:** What the design could add to prevent it

## Document:

[FULL TEXT OF aggregation.md, 193 lines]
```

## Results

| Model | Time | Findings | Unique vectors |
|-------|------|----------|----------------|
| GPT-5 | ~150s | 8 | 3 (most exhaustive) |
| Opus | ~65s | 6 | 2 (qualitatively different) |
| Sonnet | ~20s | 4 | 0 (subset of others) |

GPT-5 was most exhaustive and systematic.
Opus found qualitatively different attack vectors with system-level thinking (e.g., exploiting supervision tree restart semantics).
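
One plausible reading of the "Unique vectors" column is the number of attack vectors a model reported that no other model reported. The sketch below tallies that, assuming each finding has already been normalized by hand to a short vector label. The function name `unique_vectors`, the model keys, and the labels `v1` to `v4` are hypothetical placeholders and do not correspond to the actual findings.

```python
from collections import Counter


def unique_vectors(findings: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return, per model, the vector labels that no other model reported."""
    # Each model contributes a set, so a label is counted once per model.
    counts = Counter(label for labels in findings.values() for label in labels)
    return {
        model: {label for label in labels if counts[label] == 1}
        for model, labels in findings.items()
    }


# Toy placeholder data (hypothetical labels, NOT the results in the table).
toy = {
    "model_a": {"v1", "v2", "v3"},
    "model_b": {"v2", "v4"},
    "model_c": {"v2"},
}

unique = unique_vectors(toy)
for model in sorted(unique):
    print(model, sorted(unique[model]))
```

A model whose labels are all shared with others ends up with an empty set, which matches the "subset of others" annotation in spirit. The tally is only as reliable as the manual normalization of labels, since two models may describe the same vector in different words.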