Compare commits

...

34 Commits

Author SHA1 Message Date
Rodin ea74f7e088 ci: use rodin/security-patterns with '.' path for security reviewer
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 23s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 46s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 52s
Tests the dot path normalization fix end-to-end.
2026-05-11 07:12:25 -07:00
Rodin e6b1840ffc fix(gitea): normalize "." path to empty string in ListContents
CI / test (pull_request) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 26s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 34s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 43s
Gitea API rejects "." with HTTP 500 (malformed path component).
When patterns-files is set to ".", normalize it to empty string
before making the API call.

Fixes #70
2026-05-11 06:24:05 -07:00
aweiker 1ca9250e4a Merge pull request 'feat(gitea): add retry logic for 5xx errors' (#69) from issue-68 into main
CI / test (push) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
Reviewed-on: #69
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
Reviewed-by: Aaron Weiker <aaron@weiker.org>
2026-05-11 12:59:50 +00:00
Rodin 1b38e6ad00 fix(gitea): ensure consistent lastErr return for network errors
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 35s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 58s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m37s
Move lastErr assignment outside the retry condition so that both
network errors and HTTP 5xx paths return lastErr consistently.

Previously, on the final retry attempt, a network error would return
the raw err variable instead of lastErr. While they held the same
value in practice, the inconsistency was confusing when reading the
code.

Now both paths:
- Network errors: assign lastErr before checking retry, return lastErr
- HTTP 5xx: assign lastErr before checking retry, return lastErr

Addresses review finding #3 (MINOR) from sonnet review on PR #69.
2026-05-11 05:31:39 -07:00
Rodin 5498dccd60 fix(gitea): address MINOR review findings on retry logic
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 39s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 50s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m28s
1. Fix non-deterministic test TestDoGet_RetriesOnTemporaryNetError:
   - Replace timing-dependent listener approach with mockTransport
   - mockTransport allows controlled injection of net.OpError failures
   - Test now makes deterministic assertions: exactly 3 attempts (2 fail + 1 success)
   - Added SetHTTPClient() method for test transport injection

2. Sanitize error content in retry warning logs:
   - Added sanitizeErrorForLog() helper that omits response body content
   - For APIError: logs only 'HTTP <status>' instead of full body
   - For other errors: preserves error type information
   - Addresses security concern about logging server error content at WARN level
   - Full error with body still returned to caller for proper error handling

Both changes have corresponding test coverage.
2026-05-11 05:15:07 -07:00
Aaron Weiker ecfbfddc7c fix(gitea): redact query params from retry warning logs
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 34s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m14s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m17s
Addresses security review finding: retry warnings were logging the full
request URL which could inadvertently leak sensitive query parameters
if future callers pass them.

Added redactURL() helper that:
- Strips query parameters from URLs before logging (replaces with [redacted])
- Returns [invalid URL] for unparseable URLs to avoid leaking any data
- Preserves the base path for debugging context

The error itself (lastErr) is kept as-is since APIError.Error() already
truncates response bodies to 200 chars, and network errors don't contain
user-controlled data.
2026-05-11 04:52:41 -07:00
Rodin ac53ecfa5d fix(gitea): improve retry logic precision for net.OpError
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 46s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m11s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m23s
Address review feedback on isTemporaryNetError being too broad:

1. RetryBackoff field: Added doc comment clarifying it must be
   configured before the first request (addresses concurrency concern).

2. isTemporaryNetError: Now inspects the underlying syscall error
   instead of treating all net.OpError as retriable. Only retries on:
   - ECONNREFUSED (connection refused)
   - ECONNRESET (connection reset)
   - ENETUNREACH (network unreachable)
   - EHOSTUNREACH (host unreachable)
   - ETIMEDOUT (connection timed out)

   Permanent errors like EACCES, EPERM are no longer retried.

3. DNS errors: Changed from Temporary() to IsTimeout, since
   "no such host" is permanent and shouldn't be retried.

4. Empty backoff slice: Added comment explaining that retry without
   delay is intentional when caller explicitly configures it.

Addresses MINOR findings from sonnet-review-bot and gpt-review-bot.
2026-05-11 04:32:15 -07:00
Rodin 090ae3848c fix(gitea): make retry backoff configurable and retry temp net errors
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 34s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 45s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m34s
Address review feedback:

1. Make backoff delays injectable via Client.RetryBackoff field
   - Defaults to {1s, 2s} when nil for production
   - Tests can set shorter values for fast execution
   - Fixes slow unit tests that previously waited 3+ seconds

2. Add retry on temporary network errors (net.OpError, net.DNSError)
   - Connection refused, network unreachable, DNS failures now retry
   - Non-temporary network errors still fail immediately
   - Context cancellation still respected during backoff

Added isTemporaryNetError helper and TestIsTemporaryNetError test.
Updated existing retry tests to use configurable short backoffs.
2026-05-11 04:23:27 -07:00
Rodin 23da7eedf5 fix(gitea): address review feedback on retry logic
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 23s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 30s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 45s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m30s
- Remove dead backoff[0] element; array now only contains retry delays
- Fix time.After timer leak by using time.NewTimer with timer.Stop()
- Add io.LimitReader (64KB) for error body reads to bound memory allocation

Addresses feedback from sonnet-review-bot, security-review-bot, and gpt-review-bot.
2026-05-11 01:08:01 -07:00
Rodin 7279cdd216 feat(gitea): add retry logic for 5xx errors in doGet
CI / test (pull_request) Successful in 28s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 30s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m27s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m33s
Addresses transient HTTP 500 errors from Gitea API during pattern fetches.

Changes:
- Add retry with exponential backoff (1s, 2s) to doGet(), max 3 attempts
- Add IsServerError() helper to detect 5xx responses
- No retry on 4xx errors (client errors should propagate immediately)
- Respects context cancellation during backoff waits
- Logs retries at WARN level for observability

All existing tests pass. New tests:
- TestIsServerError: validates 5xx detection across edge cases
- TestDoGet_RetriesOn500: verifies recovery after transient errors
- TestDoGet_FailsAfterMaxRetries: verifies proper failure after exhaustion
- TestDoGet_NoRetryOn4xx: ensures client errors don't retry
- TestDoGet_RespectsContextCancellation: validates cancellation during backoff

Closes #68
2026-05-11 00:59:47 -07:00
aweiker 4c327b61d4 Merge pull request 'feat: log loaded pattern files for debugging' (#65) from issue-64 into main
CI / test (push) Successful in 15s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
Reviewed-on: #65
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
2026-05-11 06:53:00 +00:00
aweiker 877dbf9999 Merge pull request 'docs: add read:user to required token scopes' (#67) from issue-66 into main
CI / test (push) Successful in 16s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
Reviewed-on: #67
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
2026-05-11 06:52:21 +00:00
Rodin 4a1cb6b47c docs: add read:user to required token scopes
CI / test (pull_request) Successful in 13s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 20s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 27s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 50s
The read:user scope is needed for the bot to self-request as a
reviewer on PRs. Without it, the bot still functions but cannot
add itself to the reviewer list.

Closes #66
2026-05-10 23:40:24 -07:00
Rodin deade3c5a0 feat: log loaded pattern files for debugging
CI / test (pull_request) Successful in 19s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 29s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 42s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m5s
When patterns-repo is configured, now logs at Info level:
- File paths loaded from each repo
- Count of files per repo

At Debug level logs skipped files (non-markdown/txt/yaml).

Warns if no pattern files were loaded from a repo (likely
misconfigured patterns-files path).

Closes #64
2026-05-10 23:37:14 -07:00
aweiker c54cee134e Merge pull request 'feat: load personas from target repo .review-bot/personas/' (#61) from issue-60 into main
CI / test (push) Successful in 9m33s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
Reviewed-on: #61
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
Reviewed-by: Aaron Weiker <aaron@weiker.org>
2026-05-11 02:54:46 +00:00
Rodin 1dd73bc4df Revert "ci: disable setup-go cache (cache server unreachable)"
PR Ready Gate / clear-labels (pull_request) Successful in 3s
CI / test (pull_request) Successful in 9m32s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Failing after 5m21s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Failing after 5m51s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Failing after 5m58s
This reverts commit 8f564ea4f8.
2026-05-10 19:44:08 -07:00
Rodin 8f564ea4f8 ci: disable setup-go cache (cache server unreachable)
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 15s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 36s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m28s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m43s
Cache server at 192.168.109.55:35239 times out, adding 4+ minutes
to each job. Disable until cache infra is fixed.
2026-05-10 19:43:46 -07:00
Rodin 9775cb098c fix: address PR #61 review findings
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 9m32s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 9m55s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 10m38s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 11m3s
MAJOR:
- LoadRepoPersonas: add MaxPersonaFileSize check before parsing to
  prevent resource exhaustion from oversized YAML files committed
  to target repositories

MINOR:
- isNotFoundError: tighten substring match to 'HTTP 404' only to
  avoid masking auth/transport errors containing generic 'not found'
- main.go: remove duplicate flag.Parse() call
- main.go: add comment explaining nil map indexing is safe in Go
  when LoadRepoPersonas returns an error

Tests updated to reflect the intentional behavior change in
isNotFoundError and added test case for oversized file rejection.
2026-05-10 19:29:13 -07:00
Rodin 3f06ba2ea6 feat: load personas from target repo .review-bot/personas/
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 9m32s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 10m10s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 10m51s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 10m33s
Implements #60.

- Add ParsePersonaBytes() for parsing personas from byte data
- Add LoadRepoPersonas() to fetch personas from repo via Gitea API
- Add MergePersonas() to combine built-in and repo personas
- Add GetBuiltinPersonasMap() helper
- Update main.go to load repo personas first, fall back to built-in
- Add giteaClientAdapter to bridge gitea.Client to review.GiteaClient

When --persona is specified, the bot now:
1. Attempts to fetch personas from .review-bot/personas/*.yaml
2. If the named persona exists in the repo, uses it
3. Otherwise falls back to built-in personas

This allows repos to define domain-specific personas (e.g., trading
experts for gargoyle, crypto experts for kms-lite) without modifying
the review-bot codebase.
2026-05-10 19:05:37 -07:00
aweiker 593b249e09 Merge pull request 'feat: add YAML support for persona files' (#58) from issue-57 into main
CI / test (push) Successful in 9m31s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
Reviewed-on: #58
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
Reviewed-by: Aaron Weiker <aaron@weiker.org>
2026-05-11 01:39:43 +00:00
Rodin 10cd6203d4 fix: address remaining PR #58 review findings
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 9m31s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 9m54s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 10m40s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 11m27s
1. Remove dead JSON fallback in LoadBuiltinPersona
   - The embed directive only includes *.yaml files
   - JSON fallback code could never succeed
   - Simplified function to only try YAML

2. JSON parsing now rejects unknown fields
   - Switched from json.Unmarshal to json.Decoder
   - DisallowUnknownFields() matches YAML's KnownFields(true)
   - Added test coverage for JSON unknown field rejection

3. Documented symlink support in LoadPersona
   - os.Stat follows symlinks, so symlinks to regular files work
   - Added doc comment explaining the behavior
   - Added test for symlink support
2026-05-10 17:53:42 -07:00
Aaron Weiker 26f326cf51 fix: add YAML alias cycle detection and multi-document rejection
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 9m34s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 9m53s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 10m23s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 11m24s
Address security review findings:

MAJOR: Add cycle detection to checkYAMLDepth using a visited set
(seen map[*yaml.Node]struct{}) to prevent infinite recursion from
crafted YAML with self-referential aliases.

MINOR fixes:
- Add MaxYAMLNodes (1000) limit as defense-in-depth against
  wide-but-shallow structures that bypass depth limits
- Increment depth when following alias targets (was incorrectly
  passing same depth, allowing alias chains to bypass depth limit)
- Reject multi-document YAML files instead of silently ignoring
  additional documents (prevents confusing silent data loss)

Tests added:
- TestYAMLAliasCycleDetection: Direct test of cycle detection logic
- TestYAMLMultiDocumentRejection: Verifies multi-doc files rejected
- TestYAMLNodeCountLimit: Verifies wide structures are rejected
- TestCheckYAMLDepthCycleDetectionDirect: Unit test with artificial cycle
2026-05-10 17:12:01 -07:00
Rodin 4fed59ac85 yaml: enable strict field checking to catch typos
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 9m33s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 9m54s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 10m51s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 11m30s
Addresses PR #58 MINOR finding: YAML decoder now rejects unknown fields.

- Enable KnownFields(true) on YAML decoder to catch typos like
  'focuss' or 'identiy' in persona files
- Since yaml.Node.Decode() doesn't support KnownFields, we now
  do a two-pass decode: first pass checks depth limits, second
  pass decodes with strict field checking
- Add tests for unknown field rejection at top-level and nested levels
2026-05-10 16:50:07 -07:00
Rodin 6035afeea7 fix: address MINOR review findings from c3e8f0f review
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 9m33s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 9m51s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 11m13s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 11m25s
2026-05-10 16:29:44 -07:00
Rodin c3e8f0f231 fix: address PR review findings
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 9m32s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 9m53s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 10m52s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 11m0s
MAJOR fixes:
- Remove false security claim about gopkg.in/yaml.v3 having built-in depth protection
- Add explicit YAML depth limiting via yaml.Node API (MaxYAMLDepth=20)
- Add file size limit for persona files (MaxPersonaFileSize=64KB)
- Add test for deeply nested YAML rejection

MINOR fixes:
- Add sort.Strings to ListBuiltinPersonas for deterministic ordering
- Update design doc to reflect actual library used (gopkg.in/yaml.v3)
- Update README: 'Zero dependencies' → 'Minimal dependencies'
- Add test for file size limit
- Add test for sorted persona list
2026-05-10 14:43:31 -07:00
Rodin 7898dd939f feat: add YAML support for persona files (#57)
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 9m33s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 9m55s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 10m32s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 11m0s
- Add gopkg.in/yaml.v3 dependency (approved in CONVENTIONS.md)
- Update parsePersona to detect format by file extension
- Support both .yaml and .yml extensions (case-insensitive)
- Convert built-in personas to YAML format
- Add comprehensive tests for YAML parsing
- Update README with YAML examples and documentation

YAML provides cleaner multi-line strings via literal block scalars
and supports comments, making persona definitions more readable.
JSON remains supported for backwards compatibility.

Closes #57
2026-05-10 14:16:41 -07:00
rodin fededd18ad Merge pull request 'docs: allow approved third-party packages' (#59) from allow-deps into main
CI / test (push) Successful in 15s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
docs: strict dependency allowlist with CI enforcement
2026-05-10 21:07:10 +00:00
Rodin 01cde16d47 fix: validate all deps and improve robustness
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 15s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 34s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m51s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m51s
Addresses GPT review feedback:

1. MAJOR - Test deps now validated: All direct module deps (from go.mod)
   are checked against the allowlist, whether used in prod or tests.

2. MINOR - Prefix match: Uses grep -E with word boundary (^pkg(/|$|$))
   to avoid false positives on similarly-prefixed modules.

3. MINOR - Bash version check: Script now fails early with helpful
   message if Bash < 4 (macOS default). Added shebang: #!/usr/bin/env bash

4. NIT - Removed redundant grep -v '_test' (go list -deps already
   excludes test-only deps without -test flag).
2026-05-10 14:02:06 -07:00
Rodin aeb0c8cb79 fix: enforce Scope column and improve portability
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 14s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 35s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m46s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m2s
Addresses review feedback:

1. MAJOR - Scope enforcement: Script now parses the Scope column and
   ensures 'test only' packages don't appear in non-test code. Uses
   'go list -deps' to check production imports.

2. MINOR - Portability: Replaced 'grep -P' (GNU-only) with awk-based
   parsing that works on macOS/BSD.

3. MINOR - Robustness: Table parsing uses awk to split on '|' and
   extract columns properly, handling whitespace variations.

4. MINOR - Glob safety: Prefix matching now uses parameter expansion
   instead of glob patterns to prevent metacharacter issues.
2026-05-10 13:57:49 -07:00
Rodin 70267b68f4 fix: address review feedback on dependency allowlist
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 14s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 32s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m27s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m28s
Fixes:
- Single source of truth: script now parses allowlist from CONVENTIONS.md
- Fail closed: script exits non-zero if 'go list' fails
- Direct deps only: uses '-f' flag to exclude transitive deps
- Added 'precommit' to .PHONY in Makefile
- Removed unused ALLOWED_PATTERN variable
- Added Scope column to distinguish test-only vs production deps
- Clarified that transitive deps of approved packages are allowed
- Added note that enforcement script parses the table
2026-05-10 13:53:55 -07:00
Rodin 4b96231b32 docs: strict dependency allowlist with CI enforcement
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 15s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 28s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m40s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m48s
STRICT ALLOWLIST policy: Only packages explicitly listed in CONVENTIONS.md
may be imported. No exceptions.

## Changes

- Updates CONVENTIONS.md with strict allowlist language
- Adds scripts/check-deps.sh to enforce the allowlist
- Adds 'make check-deps' and 'make precommit' targets
- CI will fail if any unapproved dependency is detected

## Approved packages

- gopkg.in/yaml.v3 — YAML parsing
- github.com/google/go-cmp — test comparisons

## Process for new dependencies

1. Open a PR that ONLY updates CONVENTIONS.md
2. Requires explicit approval from Aaron
3. After merge, a separate PR may use the package
2026-05-10 13:45:12 -07:00
rodin 230419f0e2 Merge pull request 'feat: native SAP AI Core support' (#54) from feat/aicore-provider-v2 into main
CI / test (push) Successful in 14s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
feat: native SAP AI Core support (#54)
2026-05-10 20:03:32 +00:00
Rodin 7dab35de41 feat: native SAP AI Core support
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 14s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 35s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m30s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 2m0s
Add native SAP AI Core provider that handles OAuth token management and
deployment discovery automatically. This eliminates the need for the
external LLM proxy when running in SAP environments.

Changes:
- Add AICoreClient with OAuth token caching and deployment URL discovery
- Support both Anthropic and OpenAI models via AI Core deployments
- Update CI to use native AI Core provider
- Update action inputs to accept AI Core credentials
- Update README with AI Core configuration examples

Model names must match AI Core deployment names (e.g. anthropic--claude-4.6-sonnet, gpt-5).
2026-05-10 10:25:10 -07:00
aweiker c41c9590b7 Merge pull request 'feat(persona): add role-based review personas' (#55) from issue-51 into main
CI / test (push) Successful in 12s
CI / review (/anthropic/v1, anthropic--claude-4.6-sonnet, sonnet, anthropic, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (/openai/v1, gpt-5, gpt, openai, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (/openai/v1, gpt-5, security, openai, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
Reviewed-on: #55
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
Reviewed-by: Aaron Weiker <aaron@weiker.org>
2026-05-10 17:16:10 +00:00
25 changed files with 3464 additions and 219 deletions
+32 -5
View File
@@ -26,18 +26,40 @@ inputs:
required: false required: false
default: '' default: ''
llm-base-url: llm-base-url:
description: 'OpenAI-compatible LLM API base URL' description: 'OpenAI-compatible LLM API base URL (not required for aicore provider)'
required: true required: false
default: ''
llm-api-key: llm-api-key:
description: 'LLM API key' description: 'LLM API key (not required for aicore provider)'
required: true required: false
default: ''
llm-model: llm-model:
description: 'LLM model name' description: 'LLM model name'
required: true required: true
llm-provider: llm-provider:
description: 'LLM API provider: openai or anthropic (default openai)' description: 'LLM API provider: openai, anthropic, or aicore (default openai)'
required: false required: false
default: 'openai' default: 'openai'
aicore-client-id:
description: 'SAP AI Core client ID (required for aicore provider)'
required: false
default: ''
aicore-client-secret:
description: 'SAP AI Core client secret (required for aicore provider)'
required: false
default: ''
aicore-auth-url:
description: 'SAP AI Core authentication URL (required for aicore provider)'
required: false
default: ''
aicore-api-url:
description: 'SAP AI Core API URL (required for aicore provider)'
required: false
default: ''
aicore-resource-group:
description: 'SAP AI Core resource group (default: default)'
required: false
default: 'default'
conventions-file: conventions-file:
description: 'Path to conventions file in the repo (e.g. CLAUDE.md)' description: 'Path to conventions file in the repo (e.g. CLAUDE.md)'
required: false required: false
@@ -165,6 +187,11 @@ runs:
SYSTEM_PROMPT_FILE: ${{ inputs.system-prompt-file }} SYSTEM_PROMPT_FILE: ${{ inputs.system-prompt-file }}
PERSONA: ${{ inputs.persona }} PERSONA: ${{ inputs.persona }}
PERSONA_FILE: ${{ inputs.persona-file }} PERSONA_FILE: ${{ inputs.persona-file }}
AICORE_CLIENT_ID: ${{ inputs.aicore-client-id }}
AICORE_CLIENT_SECRET: ${{ inputs.aicore-client-secret }}
AICORE_AUTH_URL: ${{ inputs.aicore-auth-url }}
AICORE_API_URL: ${{ inputs.aicore-api-url }}
AICORE_RESOURCE_GROUP: ${{ inputs.aicore-resource-group }}
run: | run: |
ARGS="" ARGS=""
if [ "${{ inputs.dry-run }}" = "true" ]; then if [ "${{ inputs.dry-run }}" = "true" ]; then
+14 -13
View File
@@ -18,8 +18,10 @@ jobs:
- run: go vet ./... - run: go vet ./...
- run: go build -o review-bot ./cmd/review-bot - run: go build -o review-bot ./cmd/review-bot
# Self-review: builds from source since we're pre-release # Self-review using native SAP AI Core provider
# Models configured to match SAP AI Core deployments # Models must match SAP AI Core deployments
# Available models: gpt-5, anthropic--claude-4.6-sonnet, anthropic--claude-4.6-opus
# Removed gpt-4.1, gpt-5-mini, gpt-4.1-mini - not deployed on AI Core
review: review:
runs-on: ubuntu-24.04 runs-on: ubuntu-24.04
if: github.event_name == 'pull_request' if: github.event_name == 'pull_request'
@@ -29,19 +31,15 @@ jobs:
include: include:
- name: sonnet - name: sonnet
token_secret: SONNET_REVIEW_TOKEN token_secret: SONNET_REVIEW_TOKEN
provider: anthropic
llm_path: /anthropic/v1
model: anthropic--claude-4.6-sonnet model: anthropic--claude-4.6-sonnet
- name: gpt - name: gpt
token_secret: GPT_REVIEW_TOKEN token_secret: GPT_REVIEW_TOKEN
provider: openai
llm_path: /openai/v1
model: gpt-5 model: gpt-5
- name: security - name: security
token_secret: SECURITY_REVIEW_TOKEN token_secret: SECURITY_REVIEW_TOKEN
provider: openai
llm_path: /openai/v1
model: gpt-5 model: gpt-5
patterns_repo: rodin/security-patterns
patterns_files: "."
system_prompt_file: SECURITY_REVIEW.md system_prompt_file: SECURITY_REVIEW.md
steps: steps:
- uses: actions/checkout@v4 - uses: actions/checkout@v4
@@ -56,13 +54,16 @@ jobs:
PR_NUMBER: ${{ github.event.pull_request.number }} PR_NUMBER: ${{ github.event.pull_request.number }}
REVIEWER_TOKEN: ${{ secrets[matrix.token_secret] }} REVIEWER_TOKEN: ${{ secrets[matrix.token_secret] }}
REVIEWER_NAME: ${{ matrix.name }} REVIEWER_NAME: ${{ matrix.name }}
LLM_BASE_URL: ${{ secrets.LLM_BASE_URL }}${{ matrix.llm_path }} LLM_PROVIDER: aicore
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
LLM_MODEL: ${{ matrix.model }} LLM_MODEL: ${{ matrix.model }}
LLM_PROVIDER: ${{ matrix.provider }} AICORE_CLIENT_ID: ${{ secrets.AICORE_CLIENT_ID }}
AICORE_CLIENT_SECRET: ${{ secrets.AICORE_CLIENT_SECRET }}
AICORE_AUTH_URL: ${{ secrets.AICORE_AUTH_URL }}
AICORE_API_URL: ${{ secrets.AICORE_API_URL }}
AICORE_RESOURCE_GROUP: ${{ secrets.AICORE_RESOURCE_GROUP }}
CONVENTIONS_FILE: "CONVENTIONS.md" CONVENTIONS_FILE: "CONVENTIONS.md"
PATTERNS_REPO: "rodin/go-patterns" PATTERNS_REPO: ${{ matrix.patterns_repo || 'rodin/go-patterns' }}
PATTERNS_FILES: "README.md,patterns/" PATTERNS_FILES: ${{ matrix.patterns_files || 'README.md,patterns/' }}
LLM_TIMEOUT: "600" LLM_TIMEOUT: "600"
SYSTEM_PROMPT_FILE: ${{ matrix.system_prompt_file }} SYSTEM_PROMPT_FILE: ${{ matrix.system_prompt_file }}
run: ./review-bot run: ./review-bot
+19 -1
View File
@@ -2,8 +2,26 @@
## Language & Dependencies ## Language & Dependencies
- Go standard library only — no external dependencies.
- Target the latest stable Go release. - Target the latest stable Go release.
- **STRICT ALLOWLIST:** Only packages listed below may be imported. No exceptions.
### Approved Third-Party Packages
| Package | Use Case | Scope |
|---------|----------|-------|
| `gopkg.in/yaml.v3` | YAML parsing (persona files, config) | production |
| `github.com/google/go-cmp` | Test comparisons (`cmp.Diff`) | test only |
**Any import not in this table or the Go standard library is forbidden.**
Transitive dependencies of approved packages are automatically allowed.
To request a new dependency:
1. Open a PR that ONLY updates this table
2. Requires explicit approval from Aaron
3. After merge, a separate PR may use the package
*Enforcement: `scripts/check-deps.sh` parses this table — update only here.*
## Error Handling ## Error Handling
+7 -1
View File
@@ -1,4 +1,4 @@
.PHONY: build test test-integration lint clean coverage .PHONY: build test test-integration lint clean coverage check-deps precommit
build: build:
go build -o review-bot ./cmd/review-bot/ go build -o review-bot ./cmd/review-bot/
@@ -12,9 +12,15 @@ test-integration:
lint: lint:
go vet ./... go vet ./...
check-deps:
@./scripts/check-deps.sh
clean: clean:
rm -f review-bot rm -f review-bot
coverage: coverage:
go test -coverprofile=coverage.out ./... go test -coverprofile=coverage.out ./...
go tool cover -func=coverage.out go tool cover -func=coverage.out
# Precommit runs all checks required before pushing
precommit: check-deps lint test
+77 -35
View File
@@ -4,12 +4,12 @@ AI-powered code review bot for Gitea pull requests. Fetches diff + context, send
## Features ## Features
- **Multi-provider**: OpenAI-compatible and Anthropic Messages API - **Multi-provider**: OpenAI-compatible, Anthropic Messages API, and SAP AI Core
- **Context-aware**: Fetches full file content, conventions, language patterns, CI status - **Context-aware**: Fetches full file content, conventions, language patterns, CI status
- **Smart budget**: Automatically trims context to fit model token limits - **Smart budget**: Automatically trims context to fit model token limits
- **Idempotent reviews**: Posts new review, then cleans up stale ones (one review per bot) - **Idempotent reviews**: Posts new review, then cleans up stale ones (one review per bot)
- **Custom prompts**: Load additional instructions from a file (e.g. security-focused review) - **Custom prompts**: Load additional instructions from a file (e.g. security-focused review)
- **Zero dependencies**: Go stdlib only - **Minimal dependencies**: Go stdlib + `gopkg.in/yaml.v3` only
## Quick Start: Composite Action ## Quick Start: Composite Action
@@ -168,28 +168,56 @@ Prints the review to CI logs without posting to the PR. Useful for testing promp
llm-provider: anthropic llm-provider: anthropic
``` ```
### Using SAP AI Core
For SAP environments with AI Core deployments, use the `aicore` provider for native authentication:
```yaml
- uses: https://gitea.weiker.me/rodin/review-bot/.gitea/actions/review@v0.1.0
with:
reviewer-token: ${{ secrets.REVIEW_TOKEN }}
reviewer-name: aicore-review
llm-model: anthropic--claude-4.6-sonnet # or gpt-5
llm-provider: aicore
aicore-client-id: ${{ secrets.AICORE_CLIENT_ID }}
aicore-client-secret: ${{ secrets.AICORE_CLIENT_SECRET }}
aicore-auth-url: ${{ secrets.AICORE_AUTH_URL }}
aicore-api-url: ${{ secrets.AICORE_API_URL }}
aicore-resource-group: default
```
AI Core handles OAuth token management and deployment discovery automatically. Model names must match the deployment name in AI Core (e.g. `anthropic--claude-4.6-sonnet`, `gpt-5`).
## Action Inputs ## Action Inputs
| Input | Required | Default | Description | | Input | Required | Default | Description |
|-------|----------|---------|-------------| |-------|----------|---------|-------------|
| `reviewer-token` | Yes | — | Gitea token for posting reviews (needs `write:issue`, `write:repository`) | | `reviewer-token` | Yes | — | Gitea token for posting reviews (needs `write:issue`, `write:repository`) |
| `reviewer-name` | No | `""` | Logical identity for this reviewer. Used as sentinel for idempotent cleanup. Set this when running multiple review bots on the same PR. | | `reviewer-name` | No | `""` | Logical identity for this reviewer. Used as sentinel for idempotent cleanup. Set this when running multiple review bots on the same PR. |
| `llm-base-url` | Yes | — | LLM API base URL | | `llm-base-url` | No* | `""` | LLM API base URL (required unless using aicore provider) |
| `llm-api-key` | Yes | — | LLM API key | | `llm-api-key` | No* | `""` | LLM API key (required unless using aicore provider) |
| `llm-model` | Yes | — | Model name | | `llm-model` | Yes | — | Model name |
| `llm-provider` | No | `openai` | API provider: `openai` or `anthropic` | | `llm-provider` | No | `openai` | API provider: `openai`, `anthropic`, or `aicore` |
| `aicore-client-id` | No** | `""` | SAP AI Core client ID |
| `aicore-client-secret` | No** | `""` | SAP AI Core client secret |
| `aicore-auth-url` | No** | `""` | SAP AI Core authentication URL |
| `aicore-api-url` | No** | `""` | SAP AI Core API URL |
| `aicore-resource-group` | No | `default` | SAP AI Core resource group |
| `conventions-file` | No | `""` | Path to coding conventions file in the repo | | `conventions-file` | No | `""` | Path to coding conventions file in the repo |
| `patterns-repo` | No | `""` | Comma-separated repos with language patterns (e.g. `rodin/go-patterns`) | | `patterns-repo` | No | `""` | Comma-separated repos with language patterns (e.g. `rodin/go-patterns`) |
| `patterns-files` | No | `README.md` | Files/directories to fetch from pattern repos | | `patterns-files` | No | `README.md` | Files/directories to fetch from pattern repos |
| `system-prompt-file` | No | `""` | Local file with additional system prompt instructions | | `system-prompt-file` | No | `""` | Local file with additional system prompt instructions |
| `persona` | No | `""` | Built-in persona name (security, architect, docs) | | `persona` | No | `""` | Built-in persona name (security, architect, docs) |
| `persona-file` | No | `""` | Path to persona JSON file with custom review focus | | `persona-file` | No | `""` | Path to persona file (YAML or JSON) with custom review focus |
| `temperature` | No | `0` | LLM temperature (0 = server default) | | `temperature` | No | `0` | LLM temperature (0 = server default) |
| `timeout` | No | `300` | LLM request timeout in seconds | | `timeout` | No | `300` | LLM request timeout in seconds |
| `dry-run` | No | `false` | Print review to stdout instead of posting | | `dry-run` | No | `false` | Print review to stdout instead of posting |
| `update-existing` | No | `true` | Delete previous review from same bot before posting. Accepts: true/1/yes or false/0/no | | `update-existing` | No | `true` | Delete previous review from same bot before posting. Accepts: true/1/yes or false/0/no |
| `version` | No | `latest` | review-bot version to install | | `version` | No | `latest` | review-bot version to install |
*Required for `openai` and `anthropic` providers, not for `aicore`.
**Required only for `aicore` provider.
## Runner Requirements ## Runner Requirements
The composite action requires these tools on the runner: The composite action requires these tools on the runner:
@@ -301,11 +329,12 @@ All flags have environment variable equivalents:
### Token Scopes Required ### Token Scopes Required
| Scope | Purpose | | Scope | Purpose |
|-------|---------| |-------|--------|
| `write:issue` | Post and delete reviews | | `write:issue` | Post and delete reviews |
| `write:repository` | Read PR diffs, file content, commit statuses | | `write:repository` | Read PR diffs, file content, commit statuses |
| `read:user` | Self-request as reviewer (optional but recommended) |
No `read:user` scope needed — the bot identifies itself from the review response. Without `read:user`, the bot still works but cannot add itself to the PR's reviewer list.
## Development ## Development
@@ -380,32 +409,38 @@ Each persona posts independently with its own sentinel, so reviews don't interfe
### Custom Personas ### Custom Personas
Create a JSON file with your domain-specific review focus: Create a YAML file with your domain-specific review focus:
```json ```yaml
// .review/personas/trading.json # .review/personas/trading.yaml
{ name: trading
"name": "trading", display_name: Trading Domain Expert
"display_name": "Trading Domain Expert",
"identity": "You are a trading systems expert reviewing code for correctness.\n\nYour expertise:\n- Order lifecycle and state machines\n- Fill handling and partial fills\n- Position tracking and P&L calculations\n- Event sourcing invariants", identity: |
"focus": [ You are a trading systems expert reviewing code for correctness.
"Order state machine correctness",
"Fill handling edge cases (partial, overfill)", Your expertise:
"Position and P&L calculation accuracy", - Order lifecycle and state machines
"Event replay determinism", - Fill handling and partial fills
"Decimal precision for money" - Position tracking and P&L calculations
], - Event sourcing invariants
"ignore": [
"Code style", focus:
"General performance", - Order state machine correctness
"Documentation formatting" - Fill handling edge cases (partial, overfill)
], - Position and P&L calculation accuracy
"severity": { - Event replay determinism
"major": "Bugs that cause incorrect positions, fills, or money calculations", - Decimal precision for money
"minor": "Edge cases that could cause issues under unusual conditions",
"nit": "Clarity improvements for domain logic" ignore:
} - Code style
} - General performance
- Documentation formatting
severity:
major: "Bugs that cause incorrect positions, fills, or money calculations"
minor: "Edge cases that could cause issues under unusual conditions"
nit: "Clarity improvements for domain logic"
``` ```
Use it in CI: Use it in CI:
@@ -414,17 +449,24 @@ Use it in CI:
- uses: rodin/review-bot/.gitea/actions/review@v1 - uses: rodin/review-bot/.gitea/actions/review@v1
with: with:
reviewer-name: trading reviewer-name: trading
persona-file: .review/personas/trading.json persona-file: .review/personas/trading.yaml
... ...
``` ```
YAML is the recommended format for personas because it supports:
- Multi-line strings with `|` blocks (cleaner identity definitions)
- Comments for documentation
- More readable arrays and nested structures
JSON is also supported for backwards compatibility—just use `.json` extension.
### Persona vs system-prompt-file ### Persona vs system-prompt-file
| Feature | `persona` / `persona-file` | `system-prompt-file` | | Feature | `persona` / `persona-file` | `system-prompt-file` |
|---------|---------------------------|----------------------| |---------|---------------------------|----------------------|
| Replaces base prompt | Yes | No (appends) | | Replaces base prompt | Yes | No (appends) |
| Structured format | Yes (JSON) | No (freeform) | | Structured format | Yes (YAML/JSON) | No (freeform) |
| Focus/ignore lists | Yes | Manual | | Focus/ignore lists | Yes | Manual |
| Severity calibration | Yes | Manual | | Severity calibration | Yes | Manual |
| Header display name | Yes | No | | Header display name | Yes | No |
+111 -28
View File
@@ -69,9 +69,15 @@ func main() {
dryRun := flag.Bool("dry-run", false, "Print review to stdout instead of posting") dryRun := flag.Bool("dry-run", false, "Print review to stdout instead of posting")
llmTemp := flag.Float64("llm-temperature", envOrDefaultFloat("LLM_TEMPERATURE", 0), "LLM temperature (0 = server default)") llmTemp := flag.Float64("llm-temperature", envOrDefaultFloat("LLM_TEMPERATURE", 0), "LLM temperature (0 = server default)")
llmTimeout := flag.Int("llm-timeout", envOrDefaultInt("LLM_TIMEOUT", 300), "LLM request timeout in seconds (default 300)") llmTimeout := flag.Int("llm-timeout", envOrDefaultInt("LLM_TIMEOUT", 300), "LLM request timeout in seconds (default 300)")
llmProvider := flag.String("llm-provider", envOrDefault("LLM_PROVIDER", "openai"), "LLM API provider: openai or anthropic") llmProvider := flag.String("llm-provider", envOrDefault("LLM_PROVIDER", "openai"), "LLM API provider: openai, anthropic, or aicore")
personaName := flag.String("persona", envOrDefault("PERSONA", ""), "Built-in persona name (security, architect, docs)") personaName := flag.String("persona", envOrDefault("PERSONA", ""), "Built-in persona name (security, architect, docs)")
personaFile := flag.String("persona-file", envOrDefault("PERSONA_FILE", ""), "Path to persona JSON file") personaFile := flag.String("persona-file", envOrDefault("PERSONA_FILE", ""), "Path to persona JSON file")
// AI Core specific flags (only used when provider=aicore)
aicoreClientID := flag.String("aicore-client-id", envOrDefault("AICORE_CLIENT_ID", ""), "SAP AI Core client ID (for provider=aicore)")
aicoreClientSecret := flag.String("aicore-client-secret", envOrDefault("AICORE_CLIENT_SECRET", ""), "SAP AI Core client secret (for provider=aicore)")
aicoreAuthURL := flag.String("aicore-auth-url", envOrDefault("AICORE_AUTH_URL", ""), "SAP AI Core auth URL (for provider=aicore)")
aicoreAPIURL := flag.String("aicore-api-url", envOrDefault("AICORE_API_URL", ""), "SAP AI Core API URL (for provider=aicore)")
aicoreResourceGroup := flag.String("aicore-resource-group", envOrDefault("AICORE_RESOURCE_GROUP", "default"), "SAP AI Core resource group (for provider=aicore)")
flag.Parse() flag.Parse()
@@ -86,10 +92,20 @@ func main() {
slog.Info("review-bot starting", "version", version) slog.Info("review-bot starting", "version", version)
// Validate required fields // Validate required fields
if *giteaURL == "" || *repo == "" || *prNum == "" || *reviewerToken == "" || // For aicore provider, llm-base-url and llm-api-key are not required
*llmBaseURL == "" || *llmAPIKey == "" || *llmModel == "" { isAICore := llm.Provider(*llmProvider) == llm.ProviderAICore
if *giteaURL == "" || *repo == "" || *prNum == "" || *reviewerToken == "" || *llmModel == "" {
fmt.Fprintf(os.Stderr, "Error: missing required flags or environment variables\n\n") fmt.Fprintf(os.Stderr, "Error: missing required flags or environment variables\n\n")
fmt.Fprintf(os.Stderr, "Required: --gitea-url, --repo, --pr, --reviewer-token, --llm-base-url, --llm-api-key, --llm-model\n") fmt.Fprintf(os.Stderr, "Required: --gitea-url, --repo, --pr, --reviewer-token, --llm-model\n")
os.Exit(1)
}
if !isAICore && (*llmBaseURL == "" || *llmAPIKey == "") {
fmt.Fprintf(os.Stderr, "Error: --llm-base-url and --llm-api-key are required for provider=%s\n", *llmProvider)
os.Exit(1)
}
if isAICore && (*aicoreClientID == "" || *aicoreClientSecret == "" || *aicoreAuthURL == "" || *aicoreAPIURL == "") {
fmt.Fprintf(os.Stderr, "Error: AI Core credentials required for provider=aicore\n\n")
fmt.Fprintf(os.Stderr, "Required: --aicore-client-id, --aicore-client-secret, --aicore-auth-url, --aicore-api-url\n")
os.Exit(1) os.Exit(1)
} }
@@ -99,29 +115,7 @@ func main() {
os.Exit(1) os.Exit(1)
} }
// Load persona if specified // NOTE: Persona loading deferred until after Gitea client init to support repo personas
var persona *review.Persona
if *personaName != "" {
var err error
persona, err = review.LoadBuiltinPersona(*personaName)
if err != nil {
slog.Error("failed to load persona", "persona", *personaName, "error", err)
os.Exit(1)
}
slog.Info("loaded built-in persona", "persona", persona.Name, "display", persona.DisplayName)
} else if *personaFile != "" {
resolvedPath, err := validateWorkspacePath(*personaFile, "persona-file")
if err != nil {
slog.Error("invalid persona-file path", "error", err)
os.Exit(1)
}
persona, err = review.LoadPersona(resolvedPath)
if err != nil {
slog.Error("failed to load persona file", "file", *personaFile, "error", err)
os.Exit(1)
}
slog.Info("loaded persona from file", "file", *personaFile, "persona", persona.Name)
}
// Validate reviewer-name: only safe characters allowed in sentinel // Validate reviewer-name: only safe characters allowed in sentinel
if err := validateReviewerName(*reviewerName); err != nil { if err := validateReviewerName(*reviewerName); err != nil {
@@ -157,8 +151,17 @@ func main() {
switch llm.Provider(*llmProvider) { switch llm.Provider(*llmProvider) {
case llm.ProviderOpenAI, llm.ProviderAnthropic: case llm.ProviderOpenAI, llm.ProviderAnthropic:
llmClient.WithProvider(llm.Provider(*llmProvider)) llmClient.WithProvider(llm.Provider(*llmProvider))
case llm.ProviderAICore:
llmClient.WithAICore(llm.AICoreConfig{
ClientID: *aicoreClientID,
ClientSecret: *aicoreClientSecret,
AuthURL: *aicoreAuthURL,
APIURL: *aicoreAPIURL,
ResourceGroup: *aicoreResourceGroup,
})
slog.Info("using SAP AI Core provider", "resource_group", *aicoreResourceGroup)
default: default:
slog.Error("invalid LLM provider", "provider", *llmProvider, "valid", "openai, anthropic") slog.Error("invalid LLM provider", "provider", *llmProvider, "valid", "openai, anthropic, aicore")
os.Exit(1) os.Exit(1)
} }
if *llmTimeout > 0 { if *llmTimeout > 0 {
@@ -170,6 +173,43 @@ func main() {
ctx, cancel := context.WithTimeout(context.Background(), overallTimeout) ctx, cancel := context.WithTimeout(context.Background(), overallTimeout)
defer cancel() defer cancel()
// Load persona if specified (after Gitea client init to support repo personas)
var persona *review.Persona
if *personaName != "" {
// Try loading from repo first, then fall back to built-in
repoPersonas, err := review.LoadRepoPersonas(ctx, newGiteaClientAdapter(giteaClient), owner, repoName)
if err != nil {
slog.Warn("could not load repo personas", "repo", owner+"/"+repoName, "error", err)
// Continue with built-in personas only.
// NOTE: repoPersonas is nil here, but map indexing on a nil map is safe in Go
// (returns the zero value), so the fallback to built-in below works correctly.
}
if p, ok := repoPersonas[*personaName]; ok {
persona = p
slog.Info("loaded repo persona", "persona", persona.Name, "display", persona.DisplayName, "repo", owner+"/"+repoName)
} else {
// Fall back to built-in
persona, err = review.LoadBuiltinPersona(*personaName)
if err != nil {
slog.Error("failed to load persona", "persona", *personaName, "error", err)
os.Exit(1)
}
slog.Info("loaded built-in persona", "persona", persona.Name, "display", persona.DisplayName)
}
} else if *personaFile != "" {
resolvedPath, err := validateWorkspacePath(*personaFile, "persona-file")
if err != nil {
slog.Error("invalid persona-file path", "error", err)
os.Exit(1)
}
persona, err = review.LoadPersona(resolvedPath)
if err != nil {
slog.Error("failed to load persona file", "file", *personaFile, "error", err)
os.Exit(1)
}
slog.Info("loaded persona from file", "file", *personaFile, "persona", persona.Name)
}
slog.Info("reviewing pull request", "pr", prNumber, "repo", fmt.Sprintf("%s/%s", owner, repoName)) slog.Info("reviewing pull request", "pr", prNumber, "repo", fmt.Sprintf("%s/%s", owner, repoName))
// Step 1: Fetch PR metadata // Step 1: Fetch PR metadata
@@ -504,6 +544,9 @@ func fetchPatterns(ctx context.Context, client *gitea.Client, patternsRepo, patt
} }
owner, repo := parts[0], parts[1] owner, repo := parts[0], parts[1]
var repoLoadedFiles []string
var repoSkippedFiles []string
for _, path := range paths { for _, path := range paths {
path = strings.TrimSpace(path) path = strings.TrimSpace(path)
if path == "" { if path == "" {
@@ -519,11 +562,22 @@ func fetchPatterns(ctx context.Context, client *gitea.Client, patternsRepo, patt
for filePath, content := range files { for filePath, content := range files {
// Only include markdown and text files as patterns // Only include markdown and text files as patterns
if !isPatternFile(filePath) { if !isPatternFile(filePath) {
repoSkippedFiles = append(repoSkippedFiles, filePath)
continue continue
} }
repoLoadedFiles = append(repoLoadedFiles, filePath)
sb.WriteString(fmt.Sprintf("### %s/%s\n\n%s\n\n", repoRef, filePath, content)) sb.WriteString(fmt.Sprintf("### %s/%s\n\n%s\n\n", repoRef, filePath, content))
} }
} }
if len(repoLoadedFiles) > 0 {
slog.Info("loaded pattern files", "repo", repoRef, "count", len(repoLoadedFiles), "files", repoLoadedFiles)
} else {
slog.Warn("no pattern files loaded", "repo", repoRef, "paths", paths)
}
if len(repoSkippedFiles) > 0 {
slog.Debug("skipped non-pattern files", "repo", repoRef, "count", len(repoSkippedFiles), "files", repoSkippedFiles)
}
} }
return sb.String() return sb.String()
} }
@@ -757,3 +811,32 @@ func shouldSkipStaleReview(evaluatedSHA, currentSHA string) bool {
} }
return evaluatedSHA != currentSHA return evaluatedSHA != currentSHA
} }
// giteaClientAdapter adapts gitea.Client to review.GiteaClient interface.
type giteaClientAdapter struct {
client *gitea.Client
}
func newGiteaClientAdapter(c *gitea.Client) *giteaClientAdapter {
return &giteaClientAdapter{client: c}
}
func (a *giteaClientAdapter) ListContents(ctx context.Context, owner, repo, path string) ([]review.ContentEntry, error) {
entries, err := a.client.ListContents(ctx, owner, repo, path)
if err != nil {
return nil, err
}
result := make([]review.ContentEntry, len(entries))
for i, e := range entries {
result[i] = review.ContentEntry{
Name: e.Name,
Path: e.Path,
Type: e.Type,
}
}
return result, nil
}
func (a *giteaClientAdapter) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
return a.client.GetFileContent(ctx, owner, repo, filepath)
}
+108
View File
@@ -0,0 +1,108 @@
# Design: YAML Support for Persona Files (#57)
## Problem
JSON is awkward for persona files that contain multi-line text (identity, severity descriptions). YAML supports cleaner multi-line strings and comments, improving readability and maintainability.
## Constraints
- Backwards compatibility: existing JSON personas must continue to work
- Security: protect against DoS via deeply nested YAML (AIKIDO-2024-10486)
- Consistency: use `.yaml` extension (not `.yml`)
- Library: use `gopkg.in/yaml.v3` (approved in CONVENTIONS.md) with explicit depth limiting
## Proposed Approach
1. **Update `parsePersona`** to detect format from file extension
2. **Add YAML parsing** with explicit depth limit (defense in depth)
3. **Keep JSON as fallback** for files without `.yaml`/`.yml` extension
4. **Convert built-in personas** to YAML format
5. **Update embed directive** to include both formats
### File Extension Detection
```go
func parsePersona(data []byte, source string) (*Persona, error) {
isYAML := strings.HasSuffix(source, ".yaml") || strings.HasSuffix(source, ".yml")
if isYAML {
return parseYAML(data, source)
}
return parseJSON(data, source)
}
```
### YAML Parsing with Depth Protection
```go
func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error {
var node yaml.Node
dec := yaml.NewDecoder(bytes.NewReader(data))
if err := dec.Decode(&node); err != nil {
return err
}
if err := checkYAMLDepth(&node, 0, maxDepth); err != nil {
return err
}
return node.Decode(out)
}
func checkYAMLDepth(node *yaml.Node, depth, maxDepth int) error {
if depth > maxDepth {
return fmt.Errorf("YAML nesting depth exceeds maximum (%d)", maxDepth)
}
// Handle alias nodes by following the Alias pointer
if node.Kind == yaml.AliasNode && node.Alias != nil {
return checkYAMLDepth(node.Alias, depth, maxDepth)
}
for _, child := range node.Content {
if err := checkYAMLDepth(child, depth+1, maxDepth); err != nil {
return err
}
}
return nil
}
```
The `gopkg.in/yaml.v3` library does not have built-in depth protection, so we implement explicit depth checking by first decoding into a `yaml.Node`, walking the tree to verify depth (including alias resolution), then decoding into the target struct.
## State/Data Model
No new state. Same `Persona` struct, just different parsing.
## Error Cases
| Error | Handling |
|-------|----------|
| Invalid YAML syntax | Return parse error with source file |
| Deeply nested YAML | Library rejects (v1.16.0+ fix) |
| Unknown extension | Fall back to JSON parsing |
| Missing required fields | Validation rejects after parse |
## Edge Cases
- File with `.json` extension but YAML content → JSON parse fails, user sees error
- File with no extension → defaults to JSON
- Embedded persona reference like `builtin:security` → detect by embed path (`personas/X.yaml`)
## Testing Strategy
1. Unit tests for YAML parsing (valid, invalid, deeply nested)
2. Unit tests for extension detection
3. Integration test for built-in personas (now YAML)
4. Backwards compat test: verify JSON still works for external files
## Completion Checklist
1. [ ] `go-yaml` dependency added at v1.16.0+
2. [ ] Extension detection uses case-insensitive comparison
3. [ ] YAML parse errors include source file name
4. [ ] JSON parsing still works for `.json` files
5. [ ] Built-in personas converted to YAML with readable multi-line strings
6. [ ] Embed directive updated to include `*.yaml`
7. [ ] Test for deeply nested YAML rejection
8. [ ] All existing tests pass
## Open Questions
- Should we support both `.yaml` AND `.yml`? Issue says `.yaml` only for consistency, but some users expect `.yml`. **Decision:** Support both for reading, recommend `.yaml` in docs.
- Should we add a "format" field to detect mismatched extension/content? **Decision:** No, keep it simple. Extension determines format.
+193 -6
View File
@@ -11,9 +11,11 @@ import (
"fmt" "fmt"
"io" "io"
"log/slog" "log/slog"
"net"
"net/http" "net/http"
"net/url" "net/url"
"strings" "strings"
"syscall"
"time" "time"
) )
@@ -39,12 +41,26 @@ func IsNotFound(err error) bool {
return errors.As(err, &apiErr) && apiErr.StatusCode == http.StatusNotFound return errors.As(err, &apiErr) && apiErr.StatusCode == http.StatusNotFound
} }
// IsServerError reports whether an error is an API 5xx response.
func IsServerError(err error) bool {
var apiErr *APIError
return errors.As(err, &apiErr) && apiErr.StatusCode >= 500 && apiErr.StatusCode < 600
}
// Client interacts with the Gitea API. // Client interacts with the Gitea API.
// A Client is safe for concurrent use by multiple goroutines. // A Client is safe for concurrent use by multiple goroutines.
type Client struct { type Client struct {
baseURL string baseURL string
token string token string
http *http.Client http *http.Client
// RetryBackoff defines the delays between retry attempts.
// RetryBackoff[i] is the delay before attempt i+1 (after attempt i fails).
// If nil, defaults to {1s, 2s}. Set to shorter durations in tests.
//
// This field must be configured before the first request is made.
// Modifying it while requests are in flight is not safe.
RetryBackoff []time.Duration
} }
// NewClient creates a new Gitea API client. // NewClient creates a new Gitea API client.
@@ -56,6 +72,12 @@ func NewClient(baseURL, token string) *Client {
} }
} }
// SetHTTPClient sets the underlying HTTP client used for requests.
// This is intended for testing to inject mock transports.
func (c *Client) SetHTTPClient(hc *http.Client) {
c.http = hc
}
// PullRequest holds relevant PR metadata. // PullRequest holds relevant PR metadata.
type PullRequest struct { type PullRequest struct {
Title string `json:"title"` Title string `json:"title"`
@@ -210,7 +232,140 @@ func (c *Client) PostReview(ctx context.Context, owner, repo string, number int,
return &review, nil return &review, nil
} }
// isTemporaryNetError reports whether err is a temporary network error worth retrying.
// This includes connection refused, network unreachable, connection reset, and DNS
// timeouts. It explicitly excludes permanent errors like permission denied or
// "no such host" DNS failures.
func isTemporaryNetError(err error) bool {
if err == nil {
return false
}
// Check for OpError and inspect the underlying syscall error.
// Not all OpErrors are transient — permission denied, for example, is permanent.
var opErr *net.OpError
if errors.As(err, &opErr) {
return isRetriableSyscallError(opErr.Err)
}
// DNS errors: only retry on timeout, not on "no such host" which is permanent.
var dnsErr *net.DNSError
if errors.As(err, &dnsErr) {
return dnsErr.IsTimeout
}
// Check for net.Error with Timeout() (Temporary is deprecated)
var netErr net.Error
if errors.As(err, &netErr) {
return netErr.Timeout()
}
return false
}
// isRetriableSyscallError reports whether the underlying error from a net.OpError
// is a transient syscall error worth retrying.
func isRetriableSyscallError(err error) bool {
if err == nil {
return false
}
// Check for syscall.Errno directly or wrapped
var errno syscall.Errno
if errors.As(err, &errno) {
switch errno {
case syscall.ECONNREFUSED, // connection refused — server not listening
syscall.ECONNRESET, // connection reset by peer
syscall.ENETUNREACH, // network unreachable
syscall.EHOSTUNREACH, // host unreachable
syscall.ETIMEDOUT: // connection timed out
return true
default:
// EACCES, EPERM, etc. are permanent — don't retry
return false
}
}
// If we can't identify the specific syscall error, be conservative and retry.
// This handles wrapped errors or platform-specific error types.
// The retry count is limited, so erring on the side of retrying is safe.
return true
}
// redactURL strips query parameters from a URL for safe logging.
// This prevents accidental exposure of sensitive data that future callers
// might pass via query strings.
func redactURL(rawURL string) string {
parsed, err := url.Parse(rawURL)
if err != nil {
// If we cannot parse it, return a safe placeholder rather than
// potentially logging something sensitive.
return "[invalid URL]"
}
if parsed.RawQuery != "" {
parsed.RawQuery = "[redacted]"
}
return parsed.String()
}
// sanitizeErrorForLog returns a loggable version of an error that omits
// potentially sensitive content like response bodies. For APIError, only
// the status code is included; for other errors, the type is preserved.
func sanitizeErrorForLog(err error) string {
if err == nil {
return "<nil>"
}
var apiErr *APIError
if errors.As(err, &apiErr) {
return fmt.Sprintf("HTTP %d", apiErr.StatusCode)
}
return err.Error()
}
// doGet performs an HTTP GET request with retry on 5xx errors and temporary
// network errors. Retries up to 3 times with exponential backoff (1s, 2s delays
// by default; configurable via Client.RetryBackoff for testing).
func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) { func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {
const maxAttempts = 3
// backoff[i] is the delay before attempt i+1 (i.e., after attempt i fails).
// First attempt (i=0) has no delay; retries wait 1s then 2s by default.
backoff := c.RetryBackoff
if backoff == nil {
backoff = []time.Duration{1 * time.Second, 2 * time.Second}
}
// maxErrorBodyBytes limits how much of an error response body we read
// to protect against malicious servers sending unbounded data.
const maxErrorBodyBytes = 64 * 1024 // 64 KB
var lastErr error
for attempt := 0; attempt < maxAttempts; attempt++ {
if attempt > 0 {
// Determine delay: use backoff slice if available, otherwise retry immediately.
// An empty RetryBackoff slice means "retry without delay" — this is intentional
// as the caller explicitly configured no delays.
var delay time.Duration
if attempt-1 < len(backoff) {
delay = backoff[attempt-1]
}
if delay > 0 {
slog.Warn("retrying request after error",
"attempt", attempt+1,
"url", redactURL(reqURL),
"delay", delay.String(),
"lastError", sanitizeErrorForLog(lastErr))
timer := time.NewTimer(delay)
select {
case <-timer.C:
case <-ctx.Done():
timer.Stop()
return nil, ctx.Err()
}
}
}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, reqURL, nil) req, err := http.NewRequestWithContext(ctx, http.MethodGet, reqURL, nil)
if err != nil { if err != nil {
return nil, err return nil, err
@@ -218,16 +373,44 @@ func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {
req.Header.Set("Authorization", "token "+c.token) req.Header.Set("Authorization", "token "+c.token)
resp, err := c.http.Do(req) resp, err := c.http.Do(req)
if err != nil {
// Always capture the error for consistent return at loop end.
// This ensures both network errors and HTTP 5xx return lastErr.
lastErr = err
// Only retry temporary network errors when attempts remain.
if attempt < maxAttempts-1 && isTemporaryNetError(err) {
slog.Warn("temporary network error, will retry",
"attempt", attempt+1,
"url", redactURL(reqURL),
"error", err)
continue
}
// Non-retryable network error or final attempt exhausted.
return nil, lastErr
}
if resp.StatusCode >= 200 && resp.StatusCode < 300 {
body, err := io.ReadAll(resp.Body)
resp.Body.Close()
if err != nil { if err != nil {
return nil, err return nil, err
} }
defer resp.Body.Close() return body, nil
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
body, _ := io.ReadAll(resp.Body)
return nil, &APIError{StatusCode: resp.StatusCode, Body: string(body)}
} }
return io.ReadAll(resp.Body)
// Error path: limit how much we read from potentially malicious server
errBody, _ := io.ReadAll(io.LimitReader(resp.Body, maxErrorBodyBytes))
resp.Body.Close()
lastErr = &APIError{StatusCode: resp.StatusCode, Body: string(errBody)}
// Only retry on 5xx server errors
if resp.StatusCode < 500 || resp.StatusCode >= 600 {
return nil, lastErr
}
}
return nil, lastErr
} }
// escapePath escapes each segment of a relative file path for use in URLs. // escapePath escapes each segment of a relative file path for use in URLs.
@@ -252,6 +435,10 @@ type ContentEntry struct {
// ListContents lists files and directories at a given path in a repo. // ListContents lists files and directories at a given path in a repo.
// Pass an empty path to list the repository root. // Pass an empty path to list the repository root.
func (c *Client) ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error) { func (c *Client) ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error) {
// Normalize "." to empty string — Gitea API rejects "." with 500
if path == "." {
path = ""
}
var reqURL string var reqURL string
if path == "" { if path == "" {
reqURL = fmt.Sprintf("%s/api/v1/repos/%s/%s/contents", c.baseURL, url.PathEscape(owner), url.PathEscape(repo)) reqURL = fmt.Sprintf("%s/api/v1/repos/%s/%s/contents", c.baseURL, url.PathEscape(owner), url.PathEscape(repo))
+372
View File
@@ -6,10 +6,14 @@ import (
"errors" "errors"
"fmt" "fmt"
"io" "io"
"net"
"net/http" "net/http"
"net/http/httptest" "net/http/httptest"
"strings" "strings"
"sync/atomic"
"syscall"
"testing" "testing"
"time"
) )
func TestGetPullRequest(t *testing.T) { func TestGetPullRequest(t *testing.T) {
@@ -276,6 +280,30 @@ func TestListContents(t *testing.T) {
} }
} }
func TestListContents_DotPath(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// "." should be normalized to empty path, which hits the root contents endpoint
if r.URL.Path != "/api/v1/repos/owner/repo/contents" {
t.Errorf("expected root contents path, got: %s", r.URL.Path)
}
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, `[{"name":"README.md","path":"README.md","type":"file"}]`)
}))
defer server.Close()
client := NewClient(server.URL, "test-token")
entries, err := client.ListContents(context.Background(), "owner", "repo", ".")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(entries) != 1 {
t.Fatalf("expected 1 entry, got %d", len(entries))
}
if entries[0].Name != "README.md" {
t.Errorf("expected README.md, got %s", entries[0].Name)
}
}
func TestGetAllFilesInPath_File(t *testing.T) { func TestGetAllFilesInPath_File(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path == "/api/v1/repos/owner/repo/contents/README.md" { if r.URL.Path == "/api/v1/repos/owner/repo/contents/README.md" {
@@ -743,3 +771,347 @@ func TestResolveComment_Error(t *testing.T) {
t.Fatal("expected error for 404 response") t.Fatal("expected error for 404 response")
} }
} }
func TestIsServerError(t *testing.T) {
tests := []struct {
name string
err error
want bool
}{
{"nil error", nil, false},
{"non-API error", fmt.Errorf("network timeout"), false},
{"404 APIError", &APIError{StatusCode: 404, Body: "not found"}, false},
{"500 APIError", &APIError{StatusCode: 500, Body: "server error"}, true},
{"502 APIError", &APIError{StatusCode: 502, Body: "bad gateway"}, true},
{"503 APIError", &APIError{StatusCode: 503, Body: "unavailable"}, true},
{"599 APIError", &APIError{StatusCode: 599, Body: "edge case"}, true},
{"600 not server error", &APIError{StatusCode: 600, Body: "edge"}, false},
{"400 not server error", &APIError{StatusCode: 400, Body: "bad request"}, false},
{"wrapped 500", fmt.Errorf("fetch: %w", &APIError{StatusCode: 500, Body: "err"}), true},
{"wrapped 404", fmt.Errorf("fetch: %w", &APIError{StatusCode: 404, Body: "err"}), false},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := IsServerError(tt.err)
if got != tt.want {
t.Errorf("IsServerError(%v) = %v, want %v", tt.err, got, tt.want)
}
})
}
}
func TestDoGet_RetriesOn500(t *testing.T) {
attempts := 0
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
if attempts < 3 {
w.WriteHeader(http.StatusInternalServerError)
w.Write([]byte(`{"message":"transient error"}`))
return
}
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"data":"success"}`))
}))
defer server.Close()
client := NewClient(server.URL, "test-token")
// Use short backoff for fast tests
client.RetryBackoff = []time.Duration{1 * time.Millisecond, 1 * time.Millisecond}
body, err := client.doGet(context.Background(), server.URL+"/test")
if err != nil {
t.Fatalf("expected success after retry, got error: %v", err)
}
if string(body) != `{"data":"success"}` {
t.Errorf("body = %q, want %q", string(body), `{"data":"success"}`)
}
if attempts != 3 {
t.Errorf("attempts = %d, want 3", attempts)
}
}
func TestDoGet_FailsAfterMaxRetries(t *testing.T) {
attempts := 0
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
w.WriteHeader(http.StatusInternalServerError)
w.Write([]byte(`{"message":"persistent error"}`))
}))
defer server.Close()
client := NewClient(server.URL, "test-token")
// Use short backoff for fast tests
client.RetryBackoff = []time.Duration{1 * time.Millisecond, 1 * time.Millisecond}
_, err := client.doGet(context.Background(), server.URL+"/test")
if err == nil {
t.Fatal("expected error after max retries")
}
var apiErr *APIError
if !errors.As(err, &apiErr) {
t.Fatalf("expected APIError, got: %v", err)
}
if apiErr.StatusCode != http.StatusInternalServerError {
t.Errorf("status = %d, want 500", apiErr.StatusCode)
}
if attempts != 3 {
t.Errorf("attempts = %d, want 3 (max retries)", attempts)
}
}
func TestDoGet_NoRetryOn4xx(t *testing.T) {
attempts := 0
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
w.WriteHeader(http.StatusForbidden)
w.Write([]byte(`{"message":"forbidden"}`))
}))
defer server.Close()
client := NewClient(server.URL, "test-token")
_, err := client.doGet(context.Background(), server.URL+"/test")
if err == nil {
t.Fatal("expected error for 403")
}
var apiErr *APIError
if !errors.As(err, &apiErr) {
t.Fatalf("expected APIError, got: %v", err)
}
if apiErr.StatusCode != http.StatusForbidden {
t.Errorf("status = %d, want 403", apiErr.StatusCode)
}
if attempts != 1 {
t.Errorf("attempts = %d, want 1 (no retry on 4xx)", attempts)
}
}
func TestDoGet_RespectsContextCancellation(t *testing.T) {
attempts := 0
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
w.WriteHeader(http.StatusInternalServerError)
w.Write([]byte(`{"message":"error"}`))
}))
defer server.Close()
ctx, cancel := context.WithCancel(context.Background())
client := NewClient(server.URL, "test-token")
// Use longer backoff to give us time to cancel during the wait
client.RetryBackoff = []time.Duration{100 * time.Millisecond, 100 * time.Millisecond}
// Cancel after first attempt returns and retry begins
go func() {
time.Sleep(20 * time.Millisecond)
cancel()
}()
_, err := client.doGet(ctx, server.URL+"/test")
if err == nil {
t.Fatal("expected error on context cancellation")
}
// Should have made 1 attempt, then context cancelled during backoff
if attempts != 1 {
t.Errorf("attempts = %d, expected 1 before context cancel during backoff", attempts)
}
}
// mockTransport is a test helper that returns errors for the first N calls,
// then delegates to a real server.
type mockTransport struct {
failCount int32 // number of failures remaining (atomic)
failErr error // error to return on failure
realServer *httptest.Server
attemptsMade atomic.Int32 // tracks total attempts
}
func (m *mockTransport) RoundTrip(req *http.Request) (*http.Response, error) {
m.attemptsMade.Add(1)
remaining := atomic.AddInt32(&m.failCount, -1)
if remaining >= 0 {
// Still have failures to return
return nil, m.failErr
}
// Redirect to real server
req.URL.Host = m.realServer.Listener.Addr().String()
req.URL.Scheme = "http"
return http.DefaultTransport.RoundTrip(req)
}
func TestDoGet_RetriesOnTemporaryNetError(t *testing.T) {
// Real server that will handle successful requests
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"status":"ok"}`))
}))
defer server.Close()
// Mock transport: fail twice with ECONNREFUSED, then succeed
mt := &mockTransport{
failCount: 2,
failErr: &net.OpError{Op: "dial", Net: "tcp", Err: syscall.ECONNREFUSED},
realServer: server,
}
client := NewClient("http://fake-host/", "test-token")
client.SetHTTPClient(&http.Client{Transport: mt})
client.RetryBackoff = []time.Duration{1 * time.Millisecond, 1 * time.Millisecond}
body, err := client.doGet(context.Background(), "http://fake-host/test")
if err != nil {
t.Fatalf("expected success after retries, got error: %v", err)
}
if string(body) != `{"status":"ok"}` {
t.Errorf("body = %q, want %q", string(body), `{"status":"ok"}`)
}
// Should have made exactly 3 attempts: 2 failures + 1 success
if got := mt.attemptsMade.Load(); got != 3 {
t.Errorf("attempts = %d, want 3 (2 failures + 1 success)", got)
}
}
func TestIsTemporaryNetError(t *testing.T) {
tests := []struct {
name string
err error
want bool
}{
{"nil error", nil, false},
{"plain error", fmt.Errorf("some error"), false},
// OpError with retriable syscall errors
{"OpError ECONNREFUSED", &net.OpError{Op: "dial", Err: syscall.ECONNREFUSED}, true},
{"OpError ECONNRESET", &net.OpError{Op: "read", Err: syscall.ECONNRESET}, true},
{"OpError ENETUNREACH", &net.OpError{Op: "dial", Err: syscall.ENETUNREACH}, true},
{"OpError EHOSTUNREACH", &net.OpError{Op: "dial", Err: syscall.EHOSTUNREACH}, true},
{"OpError ETIMEDOUT", &net.OpError{Op: "dial", Err: syscall.ETIMEDOUT}, true},
// OpError with permanent syscall errors — should NOT retry
{"OpError EACCES", &net.OpError{Op: "dial", Err: syscall.EACCES}, false},
{"OpError EPERM", &net.OpError{Op: "dial", Err: syscall.EPERM}, false},
// OpError with unknown inner error — conservative retry
{"OpError unknown inner", &net.OpError{Op: "dial", Err: fmt.Errorf("unknown")}, true},
// DNS errors
{"DNS timeout", &net.DNSError{IsTimeout: true}, true},
{"DNS no such host", &net.DNSError{IsTimeout: false, Name: "bad.host"}, false},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := isTemporaryNetError(tt.err)
if got != tt.want {
t.Errorf("isTemporaryNetError(%v) = %v, want %v", tt.err, got, tt.want)
}
})
}
}
func TestIsRetriableSyscallError(t *testing.T) {
tests := []struct {
name string
err error
want bool
}{
{"nil", nil, false},
{"ECONNREFUSED", syscall.ECONNREFUSED, true},
{"ECONNRESET", syscall.ECONNRESET, true},
{"ENETUNREACH", syscall.ENETUNREACH, true},
{"EHOSTUNREACH", syscall.EHOSTUNREACH, true},
{"ETIMEDOUT", syscall.ETIMEDOUT, true},
{"EACCES (permanent)", syscall.EACCES, false},
{"EPERM (permanent)", syscall.EPERM, false},
{"ENOENT (permanent)", syscall.ENOENT, false},
{"unknown error", fmt.Errorf("something"), true}, // conservative retry
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := isRetriableSyscallError(tt.err)
if got != tt.want {
t.Errorf("isRetriableSyscallError(%v) = %v, want %v", tt.err, got, tt.want)
}
})
}
}
func TestRedactURL(t *testing.T) {
tests := []struct {
name string
input string
want string
}{
{
name: "no query params",
input: "https://gitea.example.com/api/v1/repos/owner/repo/pulls/1",
want: "https://gitea.example.com/api/v1/repos/owner/repo/pulls/1",
},
{
name: "with query params - redacts",
input: "https://gitea.example.com/api/v1/repos/owner/repo/raw/file?ref=main",
want: "https://gitea.example.com/api/v1/repos/owner/repo/raw/file?[redacted]",
},
{
name: "multiple query params",
input: "https://example.com/path?token=secret&page=1",
want: "https://example.com/path?[redacted]",
},
{
name: "invalid URL",
input: "://invalid",
want: "[invalid URL]",
},
{
name: "empty string",
input: "",
want: "",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := redactURL(tt.input)
if got != tt.want {
t.Errorf("redactURL(%q) = %q, want %q", tt.input, got, tt.want)
}
})
}
}
func TestSanitizeErrorForLog(t *testing.T) {
tests := []struct {
name string
err error
want string
}{
{
name: "nil error",
err: nil,
want: "<nil>",
},
{
name: "APIError omits body",
err: &APIError{StatusCode: 500, Body: "internal error: database connection failed"},
want: "HTTP 500",
},
{
name: "APIError with large body still only shows status",
err: &APIError{StatusCode: 502, Body: strings.Repeat("x", 1000)},
want: "HTTP 502",
},
{
name: "non-API error preserved",
err: fmt.Errorf("connection refused"),
want: "connection refused",
},
{
name: "wrapped APIError",
err: fmt.Errorf("request failed: %w", &APIError{StatusCode: 503, Body: "service unavailable"}),
want: "HTTP 503",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := sanitizeErrorForLog(tt.err)
if got != tt.want {
t.Errorf("sanitizeErrorForLog() = %q, want %q", got, tt.want)
}
})
}
}
+2
View File
@@ -1,3 +1,5 @@
module gitea.weiker.me/rodin/review-bot module gitea.weiker.me/rodin/review-bot
go 1.26.2 go 1.26.2
require gopkg.in/yaml.v3 v3.0.1
+4
View File
@@ -0,0 +1,4 @@
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
+391
View File
@@ -0,0 +1,391 @@
package llm
import (
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"net/url"
"strings"
"sync"
"time"
)
// AICoreOpenAIAPIVersion is the API version used for OpenAI models through AI Core.
// Update this when SAP AI Core releases a new stable version.
const AICoreOpenAIAPIVersion = "2024-12-01-preview"
// maxErrorBodyLen limits the length of response bodies included in error messages
// to prevent leaking potentially sensitive upstream details in logs.
const maxErrorBodyLen = 200
// AICoreConfig holds SAP AI Core authentication and connection settings.
type AICoreConfig struct {
ClientID string
ClientSecret string
AuthURL string
APIURL string
ResourceGroup string
}
// AICoreClient wraps AI Core authentication and deployment discovery.
// Thread-safe for concurrent use after construction.
//
// Design: The deployment cache is populated once and never invalidated. This is
// acceptable for short-lived CI runner processes, but longer-lived deployments
// may want to add a TTL or re-fetch on errors.
type AICoreClient struct {
config AICoreConfig
http *http.Client
mu sync.RWMutex
token string
tokenExpiry time.Time
deployments map[string]string // model name -> deployment URL
}
// NewAICoreClient creates a new AI Core client with the given configuration.
// The client uses a default 5-minute timeout; use WithTimeout to customize.
func NewAICoreClient(cfg AICoreConfig) *AICoreClient {
return &AICoreClient{
config: cfg,
http: &http.Client{Timeout: 5 * time.Minute},
deployments: make(map[string]string),
}
}
// WithTimeout sets the HTTP request timeout for AI Core calls.
// This should be called during construction, before concurrent use.
func (c *AICoreClient) WithTimeout(d time.Duration) *AICoreClient {
c.http.Timeout = d
return c
}
// truncateBody truncates a response body for inclusion in error messages.
// This prevents leaking potentially sensitive upstream response details in logs.
func truncateBody(body []byte) string {
if len(body) <= maxErrorBodyLen {
return string(body)
}
return string(body[:maxErrorBodyLen]) + "..."
}
// getToken returns a valid OAuth token, refreshing if necessary.
func (c *AICoreClient) getToken(ctx context.Context) (string, error) {
c.mu.RLock()
if c.token != "" && time.Now().Add(5*time.Minute).Before(c.tokenExpiry) {
token := c.token
c.mu.RUnlock()
return token, nil
}
c.mu.RUnlock()
c.mu.Lock()
defer c.mu.Unlock()
// Double-check after acquiring write lock
if c.token != "" && time.Now().Add(5*time.Minute).Before(c.tokenExpiry) {
return c.token, nil
}
token, expiry, err := c.fetchToken(ctx)
if err != nil {
return "", err
}
c.token = token
c.tokenExpiry = expiry
return token, nil
}
func (c *AICoreClient) fetchToken(ctx context.Context) (string, time.Time, error) {
tokenURL := strings.TrimRight(c.config.AuthURL, "/") + "/oauth/token"
data := url.Values{}
data.Set("grant_type", "client_credentials")
data.Set("client_id", c.config.ClientID)
data.Set("client_secret", c.config.ClientSecret)
req, err := http.NewRequestWithContext(ctx, http.MethodPost, tokenURL, strings.NewReader(data.Encode()))
if err != nil {
return "", time.Time{}, fmt.Errorf("create token request: %w", err)
}
req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
resp, err := c.http.Do(req)
if err != nil {
return "", time.Time{}, fmt.Errorf("token request: %w", err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
return "", time.Time{}, fmt.Errorf("read token response: %w", err)
}
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
return "", time.Time{}, fmt.Errorf("token request failed (status %d): %s", resp.StatusCode, truncateBody(body))
}
var tokenResp struct {
AccessToken string `json:"access_token"`
ExpiresIn int `json:"expires_in"`
}
if err := json.Unmarshal(body, &tokenResp); err != nil {
return "", time.Time{}, fmt.Errorf("parse token response: %w", err)
}
if tokenResp.AccessToken == "" {
return "", time.Time{}, fmt.Errorf("empty access token in response")
}
expiry := time.Now().Add(time.Duration(tokenResp.ExpiresIn) * time.Second)
return tokenResp.AccessToken, expiry, nil
}
// getDeploymentURL returns the deployment URL for a model, fetching deployments if needed.
// getDeploymentURL returns the deployment URL for a model, fetching deployments if needed.
// Also returns a valid token for use by the caller, avoiding redundant getToken calls.
//
// Note: The token is fetched before acquiring the write lock to avoid holding the lock
// during network I/O. In rare cases where multiple goroutines race and one waits a long
// time for the write lock, the token could theoretically expire. The 5-minute refresh
// buffer in getToken makes this extremely unlikely in practice.
func (c *AICoreClient) getDeploymentURL(ctx context.Context, model string) (deployURL, token string, err error) {
c.mu.RLock()
if u, ok := c.deployments[model]; ok {
c.mu.RUnlock()
// Still need a token for the caller
token, err = c.getToken(ctx)
if err != nil {
return "", "", fmt.Errorf("get token: %w", err)
}
return u, token, nil
}
c.mu.RUnlock()
// Fetch token first (before acquiring write lock to avoid holding lock during I/O)
token, err = c.getToken(ctx)
if err != nil {
return "", "", fmt.Errorf("get token for deployments: %w", err)
}
c.mu.Lock()
defer c.mu.Unlock()
// Double-check after acquiring write lock
if u, ok := c.deployments[model]; ok {
return u, token, nil
}
if err := c.fetchDeployments(ctx, token); err != nil {
return "", "", err
}
if u, ok := c.deployments[model]; ok {
return u, token, nil
}
return "", "", fmt.Errorf("no deployment found for model %q", model)
}
func (c *AICoreClient) fetchDeployments(ctx context.Context, token string) error {
deployURL := strings.TrimRight(c.config.APIURL, "/") + "/v2/lm/deployments"
req, err := http.NewRequestWithContext(ctx, http.MethodGet, deployURL, nil)
if err != nil {
return fmt.Errorf("create deployments request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+token)
req.Header.Set("AI-Resource-Group", c.config.ResourceGroup)
resp, err := c.http.Do(req)
if err != nil {
return fmt.Errorf("deployments request: %w", err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
return fmt.Errorf("read deployments response: %w", err)
}
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
return fmt.Errorf("deployments request failed (status %d): %s", resp.StatusCode, truncateBody(body))
}
var deployResp struct {
Resources []struct {
DeploymentURL string `json:"deploymentUrl"`
Status string `json:"status"`
Details struct {
Resources struct {
BackendDetails struct {
Model struct {
Name string `json:"name"`
} `json:"model"`
} `json:"backend_details"`
} `json:"resources"`
} `json:"details"`
} `json:"resources"`
}
if err := json.Unmarshal(body, &deployResp); err != nil {
return fmt.Errorf("parse deployments response: %w", err)
}
for _, r := range deployResp.Resources {
if r.Status != "RUNNING" {
continue
}
modelName := r.Details.Resources.BackendDetails.Model.Name
if modelName == "" {
continue
}
c.deployments[modelName] = r.DeploymentURL
}
return nil
}
// CompleteAnthropic sends a request to an Anthropic model via AI Core.
func (c *AICoreClient) CompleteAnthropic(ctx context.Context, model string, messages []Message, maxTokens int, temperature float64) (string, error) {
deployURL, token, err := c.getDeploymentURL(ctx, model)
if err != nil {
return "", err
}
// Extract system message
var system string
var userMessages []anthropicMsg
for _, m := range messages {
if m.Role == "system" {
system = m.Content
} else {
userMessages = append(userMessages, anthropicMsg{
Role: m.Role,
Content: m.Content,
})
}
}
reqBody := anthropicRequest{
AnthropicVersion: "bedrock-2023-05-31", // SAP AI Core uses Bedrock format
// Model omitted - AI Core deployment already specifies model
MaxTokens: maxTokens,
System: system,
Messages: userMessages,
}
if temperature > 0 {
reqBody.Temperature = temperature
}
data, err := json.Marshal(reqBody)
if err != nil {
return "", fmt.Errorf("marshal request: %w", err)
}
// AI Core uses /invoke for Anthropic models
invokeURL := strings.TrimRight(deployURL, "/") + "/invoke"
req, err := http.NewRequestWithContext(ctx, http.MethodPost, invokeURL, bytes.NewReader(data))
if err != nil {
return "", fmt.Errorf("create request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+token)
req.Header.Set("AI-Resource-Group", c.config.ResourceGroup)
req.Header.Set("Content-Type", "application/json")
resp, err := c.http.Do(req)
if err != nil {
return "", fmt.Errorf("AI Core request: %w", err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
return "", fmt.Errorf("read response: %w", err)
}
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
return "", fmt.Errorf("AI Core API error (status %d): %s", resp.StatusCode, truncateBody(body))
}
var anthropicResp anthropicResponse
if err := json.Unmarshal(body, &anthropicResp); err != nil {
return "", fmt.Errorf("parse response: %w", err)
}
if len(anthropicResp.Content) == 0 {
return "", fmt.Errorf("no content in response")
}
var sb strings.Builder
for _, block := range anthropicResp.Content {
if block.Type == "text" {
sb.WriteString(block.Text)
}
}
result := sb.String()
if result == "" {
return "", fmt.Errorf("no text content in response")
}
return result, nil
}
// CompleteOpenAI sends a request to an OpenAI model via AI Core.
func (c *AICoreClient) CompleteOpenAI(ctx context.Context, model string, messages []Message, temperature float64) (string, error) {
deployURL, token, err := c.getDeploymentURL(ctx, model)
if err != nil {
return "", err
}
reqBody := ChatRequest{
Model: model,
Temperature: temperature,
Messages: messages,
}
data, err := json.Marshal(reqBody)
if err != nil {
return "", fmt.Errorf("marshal request: %w", err)
}
// AI Core uses /chat/completions?api-version=<version> for OpenAI models
chatURL := strings.TrimRight(deployURL, "/") + "/chat/completions?api-version=" + AICoreOpenAIAPIVersion
req, err := http.NewRequestWithContext(ctx, http.MethodPost, chatURL, bytes.NewReader(data))
if err != nil {
return "", fmt.Errorf("create request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+token)
req.Header.Set("AI-Resource-Group", c.config.ResourceGroup)
req.Header.Set("Content-Type", "application/json")
resp, err := c.http.Do(req)
if err != nil {
return "", fmt.Errorf("AI Core request: %w", err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
return "", fmt.Errorf("read response: %w", err)
}
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
return "", fmt.Errorf("AI Core API error (status %d): %s", resp.StatusCode, truncateBody(body))
}
var openaiResp ChatResponse
if err := json.Unmarshal(body, &openaiResp); err != nil {
return "", fmt.Errorf("parse response: %w", err)
}
if len(openaiResp.Choices) == 0 {
return "", fmt.Errorf("no choices in response")
}
return openaiResp.Choices[0].Message.Content, nil
}
// IsAnthropicModel returns true if the model name indicates an Anthropic model.
// SAP AI Core uses "anthropic--" prefix for Anthropic models (e.g., "anthropic--claude-3-5-sonnet").
func IsAnthropicModel(model string) bool {
return strings.HasPrefix(model, "anthropic--")
}
+535
View File
@@ -0,0 +1,535 @@
package llm
import (
"context"
"encoding/json"
"fmt"
"net/http"
"net/http/httptest"
"strings"
"sync/atomic"
"testing"
"time"
)
func TestAICoreClient_TokenFetch(t *testing.T) {
tokenCalls := int32(0)
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path == "/oauth/token" {
atomic.AddInt32(&tokenCalls, 1)
if r.Method != http.MethodPost {
t.Errorf("expected POST for token, got %s", r.Method)
}
if r.Header.Get("Content-Type") != "application/x-www-form-urlencoded" {
t.Errorf("expected form content type")
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string]interface{}{
"access_token": "test-token-123",
"expires_in": 3600,
})
return
}
t.Errorf("unexpected path: %s", r.URL.Path)
}))
defer server.Close()
client := NewAICoreClient(AICoreConfig{
ClientID: "test-id",
ClientSecret: "test-secret",
AuthURL: server.URL,
APIURL: server.URL,
ResourceGroup: "default",
})
token, err := client.getToken(context.Background())
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if token != "test-token-123" {
t.Errorf("expected token 'test-token-123', got %q", token)
}
// Second call should use cached token
token2, err := client.getToken(context.Background())
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if token2 != "test-token-123" {
t.Errorf("expected cached token")
}
if atomic.LoadInt32(&tokenCalls) != 1 {
t.Errorf("expected 1 token call (cached), got %d", tokenCalls)
}
}
func TestAICoreClient_DeploymentFetch(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path == "/oauth/token" {
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string]interface{}{
"access_token": "test-token",
"expires_in": 3600,
})
return
}
if r.URL.Path == "/v2/lm/deployments" {
if r.Header.Get("Authorization") != "Bearer test-token" {
t.Errorf("expected Bearer auth")
}
if r.Header.Get("AI-Resource-Group") != "default" {
t.Errorf("expected resource group header")
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string]interface{}{
"resources": []map[string]interface{}{
{
"id": "deploy-123",
"deploymentUrl": "https://example.com/v2/inference/deployments/deploy-123",
"status": "RUNNING",
"details": map[string]interface{}{
"resources": map[string]interface{}{
"backend_details": map[string]interface{}{
"model": map[string]interface{}{
"name": "anthropic--claude-4.6-sonnet",
},
},
},
},
},
{
"id": "deploy-456",
"deploymentUrl": "https://example.com/v2/inference/deployments/deploy-456",
"status": "STOPPED",
"details": map[string]interface{}{
"resources": map[string]interface{}{
"backend_details": map[string]interface{}{
"model": map[string]interface{}{
"name": "gpt-5",
},
},
},
},
},
{
"id": "deploy-789",
"deploymentUrl": "https://example.com/v2/inference/deployments/deploy-789",
"status": "RUNNING",
"details": map[string]interface{}{
"resources": map[string]interface{}{
"backend_details": map[string]interface{}{
"model": map[string]interface{}{
"name": "gpt-5",
},
},
},
},
},
},
})
return
}
t.Errorf("unexpected path: %s", r.URL.Path)
}))
defer server.Close()
client := NewAICoreClient(AICoreConfig{
ClientID: "test-id",
ClientSecret: "test-secret",
AuthURL: server.URL,
APIURL: server.URL,
ResourceGroup: "default",
})
// Should find running deployment
url, _, err := client.getDeploymentURL(context.Background(), "anthropic--claude-4.6-sonnet")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if url != "https://example.com/v2/inference/deployments/deploy-123" {
t.Errorf("unexpected URL: %s", url)
}
// Should find running gpt-5, not stopped one
url, _, err = client.getDeploymentURL(context.Background(), "gpt-5")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if url != "https://example.com/v2/inference/deployments/deploy-789" {
t.Errorf("unexpected URL: %s", url)
}
// Should error on unknown model
_, _, err = client.getDeploymentURL(context.Background(), "unknown-model")
if err == nil {
t.Error("expected error for unknown model")
}
}
func TestAICoreClient_CompleteAnthropic(t *testing.T) {
// baseURL is set after server creation; captured by closure in handlers
var baseURL string
mux := http.NewServeMux()
mux.HandleFunc("/oauth/token", func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string]interface{}{
"access_token": "test-token",
"expires_in": 3600,
})
})
mux.HandleFunc("/v2/lm/deployments", func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string]interface{}{
"resources": []map[string]interface{}{
{
"id": "deploy-anthropic",
"deploymentUrl": baseURL + "/deployments/anthropic",
"status": "RUNNING",
"details": map[string]interface{}{
"resources": map[string]interface{}{
"backend_details": map[string]interface{}{
"model": map[string]interface{}{
"name": "anthropic--claude-4.6-sonnet",
},
},
},
},
},
},
})
})
mux.HandleFunc("/deployments/anthropic/invoke", func(w http.ResponseWriter, r *http.Request) {
if r.Header.Get("Authorization") != "Bearer test-token" {
t.Errorf("expected Bearer auth on invoke")
}
var req anthropicRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
t.Fatalf("decode request: %v", err)
}
if req.AnthropicVersion != "bedrock-2023-05-31" {
t.Errorf("expected bedrock anthropic_version in request")
}
if req.System != "You are helpful" {
t.Errorf("expected system prompt: %q", req.System)
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string]interface{}{
"content": []map[string]interface{}{
{"type": "text", "text": "Hello from AI Core!"},
},
})
})
server := httptest.NewServer(mux)
baseURL = server.URL
defer server.Close()
client := NewAICoreClient(AICoreConfig{
ClientID: "test-id",
ClientSecret: "test-secret",
AuthURL: server.URL,
APIURL: server.URL,
ResourceGroup: "default",
})
result, err := client.CompleteAnthropic(context.Background(), "anthropic--claude-4.6-sonnet", []Message{
{Role: "system", Content: "You are helpful"},
{Role: "user", Content: "Hello"},
}, 8192, 0)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if result != "Hello from AI Core!" {
t.Errorf("expected 'Hello from AI Core!', got %q", result)
}
}
func TestAICoreClient_CompleteOpenAI(t *testing.T) {
var baseURL string
mux := http.NewServeMux()
mux.HandleFunc("/oauth/token", func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string]interface{}{
"access_token": "test-token",
"expires_in": 3600,
})
})
mux.HandleFunc("/v2/lm/deployments", func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string]interface{}{
"resources": []map[string]interface{}{
{
"id": "deploy-openai",
"deploymentUrl": baseURL + "/deployments/openai",
"status": "RUNNING",
"details": map[string]interface{}{
"resources": map[string]interface{}{
"backend_details": map[string]interface{}{
"model": map[string]interface{}{
"name": "gpt-5",
},
},
},
},
},
},
})
})
mux.HandleFunc("/deployments/openai/chat/completions", func(w http.ResponseWriter, r *http.Request) {
if r.URL.Query().Get("api-version") != AICoreOpenAIAPIVersion {
t.Errorf("expected api-version %s, got %s", AICoreOpenAIAPIVersion, r.URL.Query().Get("api-version"))
}
var req ChatRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
t.Fatalf("decode request: %v", err)
}
if req.Model != "gpt-5" {
t.Errorf("expected model gpt-5, got %s", req.Model)
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(ChatResponse{
Choices: []struct {
Message struct {
Content string `json:"content"`
} `json:"message"`
}{
{Message: struct {
Content string `json:"content"`
}{Content: "Hello from GPT-5!"}},
},
})
})
server := httptest.NewServer(mux)
baseURL = server.URL
defer server.Close()
client := NewAICoreClient(AICoreConfig{
ClientID: "test-id",
ClientSecret: "test-secret",
AuthURL: server.URL,
APIURL: server.URL,
ResourceGroup: "default",
})
result, err := client.CompleteOpenAI(context.Background(), "gpt-5", []Message{
{Role: "user", Content: "Hello"},
}, 0)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if result != "Hello from GPT-5!" {
t.Errorf("expected 'Hello from GPT-5!', got %q", result)
}
}
func TestIsAnthropicModel(t *testing.T) {
tests := []struct {
model string
expected bool
}{
// SAP AI Core uses "anthropic--" prefix for Anthropic models
{"anthropic--claude-4.6-sonnet", true},
{"anthropic--claude-4.6-opus", true},
{"anthropic--claude-3-5-sonnet", true},
// Non-prefixed model names are not detected as Anthropic
// (SAP AI Core always uses the prefix for Anthropic models)
{"claude-sonnet-4", false},
{"gpt-5", false},
{"gpt-4.1", false},
{"llama-3", false},
{"my-claude-model", false}, // Avoid false positives on "claude" substring
}
for _, tt := range tests {
got := IsAnthropicModel(tt.model)
if got != tt.expected {
t.Errorf("IsAnthropicModel(%q) = %v, want %v", tt.model, got, tt.expected)
}
}
}
func TestAICoreClient_TokenExpiry(t *testing.T) {
tokenCalls := int32(0)
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path == "/oauth/token" {
call := atomic.AddInt32(&tokenCalls, 1)
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string]interface{}{
"access_token": fmt.Sprintf("token-%d", call),
"expires_in": 1, // 1 second expiry
})
return
}
}))
defer server.Close()
client := NewAICoreClient(AICoreConfig{
ClientID: "test-id",
ClientSecret: "test-secret",
AuthURL: server.URL,
APIURL: server.URL,
ResourceGroup: "default",
})
// First call
token1, err := client.getToken(context.Background())
if err != nil {
t.Fatalf("first getToken: %v", err)
}
// Force token expiry by manipulating expiry time
client.mu.Lock()
client.tokenExpiry = time.Now().Add(-time.Hour)
client.mu.Unlock()
// Should fetch new token
token2, err := client.getToken(context.Background())
if err != nil {
t.Fatalf("second getToken: %v", err)
}
if token1 == token2 {
t.Error("expected different tokens after expiry")
}
if atomic.LoadInt32(&tokenCalls) != 2 {
t.Errorf("expected 2 token calls, got %d", tokenCalls)
}
}
func TestAICoreClient_WithTimeout(t *testing.T) {
client := NewAICoreClient(AICoreConfig{
ClientID: "test-id",
ClientSecret: "test-secret",
AuthURL: "https://auth.example.com",
APIURL: "https://api.example.com",
ResourceGroup: "default",
})
// Default timeout is 5 minutes
if client.http.Timeout != 5*time.Minute {
t.Errorf("expected default timeout 5m, got %v", client.http.Timeout)
}
// WithTimeout should update the timeout
client.WithTimeout(10 * time.Minute)
if client.http.Timeout != 10*time.Minute {
t.Errorf("expected timeout 10m, got %v", client.http.Timeout)
}
}
func TestClient_WithAICore(t *testing.T) {
client := NewClient("http://example.com", "key", "model")
if client.provider != ProviderOpenAI {
t.Errorf("expected default provider openai, got %s", client.provider)
}
client.WithAICore(AICoreConfig{
ClientID: "id",
ClientSecret: "secret",
AuthURL: "https://auth.example.com",
APIURL: "https://api.example.com",
ResourceGroup: "default",
})
if client.provider != ProviderAICore {
t.Errorf("expected provider aicore, got %s", client.provider)
}
if client.aicore == nil {
t.Error("expected aicore client to be set")
}
}
func TestClient_WithTimeout_PropagatestoAICore(t *testing.T) {
client := NewClient("http://example.com", "key", "model").
WithAICore(AICoreConfig{
ClientID: "id",
ClientSecret: "secret",
AuthURL: "https://auth.example.com",
APIURL: "https://api.example.com",
ResourceGroup: "default",
})
// Default should be 5 minutes (inherited from parent client)
if client.aicore.http.Timeout != 5*time.Minute {
t.Errorf("expected aicore default timeout 5m, got %v", client.aicore.http.Timeout)
}
// WithTimeout should propagate to AI Core client
client.WithTimeout(15 * time.Minute)
if client.http.Timeout != 15*time.Minute {
t.Errorf("expected parent timeout 15m, got %v", client.http.Timeout)
}
if client.aicore.http.Timeout != 15*time.Minute {
t.Errorf("expected aicore timeout 15m, got %v", client.aicore.http.Timeout)
}
}
func TestClient_CompleteAICore(t *testing.T) {
var baseURL string
mux := http.NewServeMux()
mux.HandleFunc("/oauth/token", func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string]interface{}{
"access_token": "test-token",
"expires_in": 3600,
})
})
mux.HandleFunc("/v2/lm/deployments", func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string]interface{}{
"resources": []map[string]interface{}{
{
"id": "deploy-test",
"deploymentUrl": baseURL + "/deployments/test",
"status": "RUNNING",
"details": map[string]interface{}{
"resources": map[string]interface{}{
"backend_details": map[string]interface{}{
"model": map[string]interface{}{
"name": "gpt-5",
},
},
},
},
},
},
})
})
mux.HandleFunc("/deployments/test/chat/completions", func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(ChatResponse{
Choices: []struct {
Message struct {
Content string `json:"content"`
} `json:"message"`
}{
{Message: struct {
Content string `json:"content"`
}{Content: "AI Core via Client works!"}},
},
})
})
server := httptest.NewServer(mux)
baseURL = server.URL
defer server.Close()
client := NewClient("", "", "gpt-5").WithAICore(AICoreConfig{
ClientID: "test-id",
ClientSecret: "test-secret",
AuthURL: server.URL,
APIURL: server.URL,
ResourceGroup: "default",
})
result, err := client.Complete(context.Background(), []Message{
{Role: "user", Content: "Hello"},
})
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if !strings.Contains(result, "AI Core via Client works!") {
t.Errorf("unexpected result: %s", result)
}
}
+34 -3
View File
@@ -1,6 +1,6 @@
// Package llm provides clients for LLM chat completion APIs. // Package llm provides clients for LLM chat completion APIs.
// //
// Supports OpenAI-compatible (default) and Anthropic Messages API providers. // Supports OpenAI-compatible (default), Anthropic Messages API, and SAP AI Core providers.
package llm package llm
import ( import (
@@ -22,6 +22,8 @@ const (
ProviderOpenAI Provider = "openai" ProviderOpenAI Provider = "openai"
// ProviderAnthropic uses the Anthropic Messages API endpoint. // ProviderAnthropic uses the Anthropic Messages API endpoint.
ProviderAnthropic Provider = "anthropic" ProviderAnthropic Provider = "anthropic"
// ProviderAICore uses SAP AI Core with OAuth authentication.
ProviderAICore Provider = "aicore"
) )
// Client calls an LLM chat completion API. // Client calls an LLM chat completion API.
@@ -35,6 +37,7 @@ type Client struct {
temperature float64 temperature float64
provider Provider provider Provider
http *http.Client http *http.Client
aicore *AICoreClient // Only set when provider is aicore
} }
// NewClient creates a new LLM client. Default provider is OpenAI-compatible. // NewClient creates a new LLM client. Default provider is OpenAI-compatible.
@@ -49,8 +52,12 @@ func NewClient(baseURL, apiKey, model string) *Client {
} }
// WithTimeout sets the HTTP request timeout for LLM calls (default 5 minutes). // WithTimeout sets the HTTP request timeout for LLM calls (default 5 minutes).
// When using AI Core, this also sets the timeout on the AI Core client.
func (c *Client) WithTimeout(d time.Duration) *Client { func (c *Client) WithTimeout(d time.Duration) *Client {
c.http.Timeout = d c.http.Timeout = d
if c.aicore != nil {
c.aicore.WithTimeout(d)
}
return c return c
} }
@@ -60,12 +67,21 @@ func (c *Client) WithTemperature(t float64) *Client {
return c return c
} }
// WithProvider sets the API provider format (openai or anthropic). // WithProvider sets the API provider format (openai, anthropic, or aicore).
func (c *Client) WithProvider(p Provider) *Client { func (c *Client) WithProvider(p Provider) *Client {
c.provider = p c.provider = p
return c return c
} }
// WithAICore configures the client to use SAP AI Core for authentication.
// This sets the provider to aicore automatically.
// The AI Core client inherits the current HTTP timeout from this client.
func (c *Client) WithAICore(cfg AICoreConfig) *Client {
c.provider = ProviderAICore
c.aicore = NewAICoreClient(cfg).WithTimeout(c.http.Timeout)
return c
}
// Message represents a chat message. // Message represents a chat message.
type Message struct { type Message struct {
Role string `json:"role"` Role string `json:"role"`
@@ -82,6 +98,8 @@ func (c *Client) Complete(ctx context.Context, messages []Message) (string, erro
switch c.provider { switch c.provider {
case ProviderAnthropic: case ProviderAnthropic:
result, err = c.completeAnthropic(ctx, messages) result, err = c.completeAnthropic(ctx, messages)
case ProviderAICore:
result, err = c.completeAICore(ctx, messages)
default: default:
result, err = c.completeOpenAI(ctx, messages) result, err = c.completeOpenAI(ctx, messages)
} }
@@ -106,6 +124,18 @@ func (c *Client) Complete(ctx context.Context, messages []Message) (string, erro
return "", err return "", err
} }
// completeAICore routes to AI Core using the appropriate endpoint based on model type.
func (c *Client) completeAICore(ctx context.Context, messages []Message) (string, error) {
if c.aicore == nil {
return "", fmt.Errorf("AI Core client not configured")
}
if IsAnthropicModel(c.model) {
return c.aicore.CompleteAnthropic(ctx, c.model, messages, 8192, c.temperature)
}
return c.aicore.CompleteOpenAI(ctx, c.model, messages, c.temperature)
}
// isRetryableError returns true for transient errors worth retrying. // isRetryableError returns true for transient errors worth retrying.
func isRetryableError(err error) bool { func isRetryableError(err error) bool {
if err == nil { if err == nil {
@@ -176,7 +206,8 @@ func (c *Client) completeOpenAI(ctx context.Context, messages []Message) (string
// --- Anthropic Messages API implementation --- // --- Anthropic Messages API implementation ---
type anthropicRequest struct { type anthropicRequest struct {
Model string `json:"model"` AnthropicVersion string `json:"anthropic_version,omitempty"`
Model string `json:"model,omitempty"`
MaxTokens int `json:"max_tokens"` MaxTokens int `json:"max_tokens"`
System string `json:"system,omitempty"` System string `json:"system,omitempty"`
Messages []anthropicMsg `json:"messages"` Messages []anthropicMsg `json:"messages"`
+168 -21
View File
@@ -1,81 +1,153 @@
package review package review
import ( import (
"bytes"
"embed" "embed"
"encoding/json" "encoding/json"
"fmt" "fmt"
"os" "os"
"sort"
"strings" "strings"
"unicode/utf8" "unicode/utf8"
"gopkg.in/yaml.v3"
) )
//go:embed personas/*.json //go:embed personas/*.yaml
var embeddedPersonas embed.FS var embeddedPersonas embed.FS
// MaxPersonaFileSize is the maximum size for persona files (64 KB).
// This prevents denial-of-service via excessively large files.
const MaxPersonaFileSize = 64 * 1024
// MaxYAMLDepth is the maximum nesting depth allowed in YAML persona files.
// This prevents stack exhaustion from deeply nested structures.
const MaxYAMLDepth = 20
// MaxYAMLNodes is the maximum number of YAML nodes allowed in persona files.
// This prevents DoS via wide-but-shallow structures that bypass depth limits.
const MaxYAMLNodes = 1000
// Persona defines a specialized review role with focused expertise. // Persona defines a specialized review role with focused expertise.
type Persona struct { type Persona struct {
Name string `json:"name"` Name string `json:"name" yaml:"name"`
DisplayName string `json:"display_name"` DisplayName string `json:"display_name" yaml:"display_name"`
ModelPref string `json:"model_preference,omitempty"` ModelPref string `json:"model_preference,omitempty" yaml:"model_preference,omitempty"`
Identity string `json:"identity"` Identity string `json:"identity" yaml:"identity"`
Focus []string `json:"focus"` Focus []string `json:"focus" yaml:"focus"`
Ignore []string `json:"ignore"` Ignore []string `json:"ignore" yaml:"ignore"`
Severity Severity `json:"severity"` Severity Severity `json:"severity" yaml:"severity"`
OutputFormat string `json:"output_format,omitempty"` OutputFormat string `json:"output_format,omitempty" yaml:"output_format,omitempty"`
} }
// Severity defines what constitutes each severity level for this persona. // Severity defines what constitutes each severity level for this persona.
// These are prompt guidance for the LLM, not output format changes. // These are prompt guidance for the LLM, not output format changes.
type Severity struct { type Severity struct {
Major string `json:"major"` Major string `json:"major" yaml:"major"`
Minor string `json:"minor"` Minor string `json:"minor" yaml:"minor"`
Nit string `json:"nit"` Nit string `json:"nit" yaml:"nit"`
} }
// LoadPersona loads a persona from a JSON file path. // LoadPersona loads a persona from a JSON or YAML file path.
// Format is detected by file extension: .yaml/.yml for YAML, .json or other for JSON.
// Files larger than MaxPersonaFileSize are rejected.
//
// Symlinks are supported: os.Stat follows symlinks, so a symlink pointing to
// a regular file will pass the IsRegular() check. Symlinks to non-regular files
// (directories, FIFOs, devices) are still rejected.
func LoadPersona(path string) (*Persona, error) { func LoadPersona(path string) (*Persona, error) {
// os.Stat follows symlinks, so symlinks to regular files are supported.
// The IsRegular() check operates on the target, not the symlink itself.
info, err := os.Stat(path)
if err != nil {
return nil, fmt.Errorf("read persona file %s: %w", path, err)
}
if !info.Mode().IsRegular() {
return nil, fmt.Errorf("persona file %s is not a regular file", path)
}
if info.Size() > MaxPersonaFileSize {
return nil, fmt.Errorf("persona file %s exceeds maximum size (%d bytes)", path, MaxPersonaFileSize)
}
data, err := os.ReadFile(path) data, err := os.ReadFile(path)
if err != nil { if err != nil {
return nil, fmt.Errorf("read persona file %s: %w", path, err) return nil, fmt.Errorf("read persona file %s: %w", path, err)
} }
// Re-check size after read to defend against TOCTOU races where file
// grows between stat and read (e.g., appending process, replaced file).
if len(data) > MaxPersonaFileSize {
return nil, fmt.Errorf("persona file %s exceeds maximum size (%d bytes)", path, MaxPersonaFileSize)
}
return parsePersona(data, path) return parsePersona(data, path)
} }
// LoadBuiltinPersona loads a built-in persona by name. // LoadBuiltinPersona loads a built-in persona by name.
// Returns an error if the persona doesn't exist. // Returns an error if the persona doesn't exist.
// Built-in personas are stored in YAML format only (see embed directive).
func LoadBuiltinPersona(name string) (*Persona, error) { func LoadBuiltinPersona(name string) (*Persona, error) {
filename := name + ".json" yamlFile := name + ".yaml"
data, err := embeddedPersonas.ReadFile("personas/" + filename) // embed.FS paths use forward slashes per io/fs spec data, err := embeddedPersonas.ReadFile("personas/" + yamlFile)
if err != nil { if err != nil {
available := ListBuiltinPersonas() available := ListBuiltinPersonas()
return nil, fmt.Errorf("unknown built-in persona %q (available: %s)", name, strings.Join(available, ", ")) return nil, fmt.Errorf("unknown built-in persona %q (available: %s)", name, strings.Join(available, ", "))
} }
return parsePersona(data, "builtin:"+name) return parsePersona(data, "builtin:"+yamlFile)
} }
// ListBuiltinPersonas returns the names of all built-in personas. // ListBuiltinPersonas returns the names of all built-in personas in sorted order.
// Returns an empty slice if the embedded directory cannot be read. // Returns an empty slice if the embedded directory cannot be read.
func ListBuiltinPersonas() []string { func ListBuiltinPersonas() []string {
entries, err := embeddedPersonas.ReadDir("personas") entries, err := embeddedPersonas.ReadDir("personas")
if err != nil { if err != nil {
return []string{} return []string{}
} }
var names []string seen := make(map[string]bool)
for _, e := range entries { for _, e := range entries {
if e.IsDir() { if e.IsDir() {
continue continue
} }
name := e.Name() name := e.Name()
if strings.HasSuffix(name, ".json") { // Strip extension to get persona name
names = append(names, strings.TrimSuffix(name, ".json")) var personaName string
switch {
case strings.HasSuffix(name, ".yaml"):
personaName = strings.TrimSuffix(name, ".yaml")
case strings.HasSuffix(name, ".yml"):
personaName = strings.TrimSuffix(name, ".yml")
case strings.HasSuffix(name, ".json"):
personaName = strings.TrimSuffix(name, ".json")
default:
continue
}
if !seen[personaName] {
seen[personaName] = true
} }
} }
names := make([]string, 0, len(seen))
for name := range seen {
names = append(names, name)
}
sort.Strings(names)
return names return names
} }
// parsePersona parses persona data from JSON or YAML format.
// Format is detected by the source file extension.
func parsePersona(data []byte, source string) (*Persona, error) { func parsePersona(data []byte, source string) (*Persona, error) {
lowerSource := strings.ToLower(source)
isYAML := strings.HasSuffix(lowerSource, ".yaml") || strings.HasSuffix(lowerSource, ".yml")
var p Persona var p Persona
if err := json.Unmarshal(data, &p); err != nil { var err error
if isYAML {
err = unmarshalYAMLWithDepthLimit(data, &p, MaxYAMLDepth)
} else {
// Use json.Decoder with DisallowUnknownFields for consistency with
// YAML's KnownFields(true) - both reject unknown fields to catch typos.
dec := json.NewDecoder(bytes.NewReader(data))
dec.DisallowUnknownFields()
err = dec.Decode(&p)
}
if err != nil {
return nil, fmt.Errorf("parse persona %s: %w", source, err) return nil, fmt.Errorf("parse persona %s: %w", source, err)
} }
if err := validatePersona(&p, source); err != nil { if err := validatePersona(&p, source); err != nil {
@@ -84,6 +156,81 @@ func parsePersona(data []byte, source string) (*Persona, error) {
return &p, nil return &p, nil
} }
// unmarshalYAMLWithDepthLimit unmarshals YAML data with explicit depth limiting
// and strict field checking. This protects against stack exhaustion from deeply
// nested structures and catches typos in field names.
// Multi-document YAML files are rejected to prevent silent data loss.
func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error {
// First pass: decode into a yaml.Node to check depth limits and node counts.
// This prevents stack exhaustion before we attempt to decode into structs.
var node yaml.Node
dec := yaml.NewDecoder(bytes.NewReader(data))
if err := dec.Decode(&node); err != nil {
return err
}
// Reject multi-document YAML files - silently ignoring additional documents
// could lead to confusing behavior where users think their changes take effect.
var extra yaml.Node
if dec.Decode(&extra) == nil {
return fmt.Errorf("multi-document YAML is not supported; only single-document files are allowed")
}
nodeCount := 0
if err := checkYAMLDepth(&node, 0, maxDepth, MaxYAMLNodes, make(map[*yaml.Node]struct{}), &nodeCount); err != nil {
return err
}
// Second pass: decode with strict field checking enabled.
// KnownFields(true) rejects unknown keys, catching typos like "focuss" or "identiy".
// We must re-decode from the original data because yaml.Node.Decode() doesn't
// support the KnownFields option.
strictDec := yaml.NewDecoder(bytes.NewReader(data))
strictDec.KnownFields(true)
return strictDec.Decode(out)
}
// checkYAMLDepth recursively checks that YAML nodes don't exceed the depth limit
// or the total node count limit. It also detects alias cycles to prevent infinite
// recursion from crafted YAML with self-referential aliases.
func checkYAMLDepth(node *yaml.Node, depth, maxDepth, maxNodes int, seen map[*yaml.Node]struct{}, nodeCount *int) error {
if depth > maxDepth {
return fmt.Errorf("YAML nesting depth exceeds maximum (%d)", maxDepth)
}
// Track total nodes visited as defense-in-depth against wide-but-shallow attacks.
*nodeCount++
if *nodeCount > maxNodes {
return fmt.Errorf("YAML node count exceeds maximum (%d)", maxNodes)
}
// Cycle detection: if we've seen this node before, we're in a cycle.
if _, ok := seen[node]; ok {
return nil // Already validated this subtree, skip to avoid infinite recursion.
}
seen[node] = struct{}{}
// Handle alias nodes: follow the alias to its anchor target.
// Increment depth when following aliases since they expand the effective structure.
if node.Kind == yaml.AliasNode && node.Alias != nil {
return checkYAMLDepth(node.Alias, depth+1, maxDepth, maxNodes, seen, nodeCount)
}
for _, child := range node.Content {
if err := checkYAMLDepth(child, depth+1, maxDepth, maxNodes, seen, nodeCount); err != nil {
return err
}
}
return nil
}
// ParsePersonaBytes parses persona data from bytes with a source label for errors.
// This is useful for parsing personas fetched from external sources (e.g., Gitea API)
// without requiring filesystem access. Format is detected by source extension.
func ParsePersonaBytes(data []byte, source string) (*Persona, error) {
return parsePersona(data, source)
}
func validatePersona(p *Persona, source string) error { func validatePersona(p *Persona, source string) error {
if p.Name == "" { if p.Name == "" {
return fmt.Errorf("persona %s: name is required", source) return fmt.Errorf("persona %s: name is required", source)
+548 -9
View File
@@ -1,10 +1,13 @@
package review package review
import ( import (
"fmt"
"os" "os"
"path/filepath" "path/filepath"
"strings" "strings"
"testing" "testing"
"gopkg.in/yaml.v3"
) )
func TestLoadBuiltinPersona(t *testing.T) { func TestLoadBuiltinPersona(t *testing.T) {
@@ -87,6 +90,83 @@ func TestListBuiltinPersonas(t *testing.T) {
} }
} }
func TestLoadPersonaFromYAMLFile(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "test.yaml")
content := `# Test persona
name: test
display_name: Test Persona
identity: |
You are a test persona.
Multi-line identity works.
focus:
- testing
- validation
ignore:
- nothing
severity:
major: Big problems
minor: Small problems
nit: Tiny problems
`
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
p, err := LoadPersona(path)
if err != nil {
t.Fatalf("LoadPersona failed: %v", err)
}
if p.Name != "test" {
t.Errorf("Name = %q, want %q", p.Name, "test")
}
if p.DisplayName != "Test Persona" {
t.Errorf("DisplayName = %q, want %q", p.DisplayName, "Test Persona")
}
if len(p.Focus) != 2 {
t.Errorf("Focus len = %d, want 2", len(p.Focus))
}
if !strings.Contains(p.Identity, "Multi-line") {
t.Error("Identity should contain multi-line content")
}
}
func TestLoadPersonaFromYMLFile(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "test.yml")
content := `name: test
display_name: Test YML
identity: Test identity
focus:
- testing
ignore: []
severity:
major: Big
minor: Small
nit: Tiny
`
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
p, err := LoadPersona(path)
if err != nil {
t.Fatalf("LoadPersona failed: %v", err)
}
if p.Name != "test" {
t.Errorf("Name = %q, want %q", p.Name, "test")
}
if p.DisplayName != "Test YML" {
t.Errorf("DisplayName = %q, want %q", p.DisplayName, "Test YML")
}
}
func TestLoadPersonaFromJSONFile(t *testing.T) { func TestLoadPersonaFromJSONFile(t *testing.T) {
dir := t.TempDir() dir := t.TempDir()
path := filepath.Join(dir, "test.json") path := filepath.Join(dir, "test.json")
@@ -96,6 +176,7 @@ func TestLoadPersonaFromJSONFile(t *testing.T) {
"display_name": "Test Persona", "display_name": "Test Persona",
"identity": "You are a test persona.\nMulti-line identity works.", "identity": "You are a test persona.\nMulti-line identity works.",
"focus": ["testing", "validation"], "focus": ["testing", "validation"],
"ignore": ["nothing"], "ignore": ["nothing"],
"severity": { "severity": {
"major": "Big problems", "major": "Big problems",
@@ -130,22 +211,38 @@ func TestLoadPersonaFromJSONFile(t *testing.T) {
func TestLoadPersonaValidation(t *testing.T) { func TestLoadPersonaValidation(t *testing.T) {
tests := []struct { tests := []struct {
name string name string
json string content string
ext string
wantErr string wantErr string
}{ }{
{ {
name: "missing name", name: "missing name yaml",
json: `{"identity": "test"}`, content: "identity: test\n",
ext: ".yaml",
wantErr: "name is required", wantErr: "name is required",
}, },
{ {
name: "missing identity", name: "missing identity yaml",
json: `{"name": "test"}`, content: "name: test\n",
ext: ".yaml",
wantErr: "identity is required",
},
{
name: "missing name json",
content: `{"identity": "test"}`,
ext: ".json",
wantErr: "name is required",
},
{
name: "missing identity json",
content: `{"name": "test"}`,
ext: ".json",
wantErr: "identity is required", wantErr: "identity is required",
}, },
{ {
name: "display_name defaults to name", name: "display_name defaults to name",
json: `{"name": "test", "identity": "test identity"}`, content: "name: test\nidentity: test identity\n",
ext: ".yaml",
// No error expected - should succeed // No error expected - should succeed
}, },
} }
@@ -153,8 +250,8 @@ func TestLoadPersonaValidation(t *testing.T) {
for _, tt := range tests { for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) { t.Run(tt.name, func(t *testing.T) {
dir := t.TempDir() dir := t.TempDir()
path := filepath.Join(dir, "test.json") path := filepath.Join(dir, "test"+tt.ext)
if err := os.WriteFile(path, []byte(tt.json), 0644); err != nil { if err := os.WriteFile(path, []byte(tt.content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err) t.Fatalf("failed to write test file: %v", err)
} }
@@ -184,12 +281,25 @@ func TestLoadPersonaValidation(t *testing.T) {
} }
func TestLoadPersonaFileNotFound(t *testing.T) { func TestLoadPersonaFileNotFound(t *testing.T) {
_, err := LoadPersona("/nonexistent/path/persona.json") _, err := LoadPersona("/nonexistent/path/persona.yaml")
if err == nil { if err == nil {
t.Error("expected error for nonexistent file") t.Error("expected error for nonexistent file")
} }
} }
func TestLoadPersonaInvalidYAML(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "invalid.yaml")
if err := os.WriteFile(path, []byte("not valid yaml:\n - [broken"), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Error("expected error for invalid YAML")
}
}
func TestLoadPersonaInvalidJSON(t *testing.T) { func TestLoadPersonaInvalidJSON(t *testing.T) {
dir := t.TempDir() dir := t.TempDir()
path := filepath.Join(dir, "invalid.json") path := filepath.Join(dir, "invalid.json")
@@ -203,6 +313,38 @@ func TestLoadPersonaInvalidJSON(t *testing.T) {
} }
} }
func TestLoadPersonaCaseInsensitiveExtension(t *testing.T) {
tests := []struct {
name string
ext string
}{
{"lowercase yaml", ".yaml"},
{"uppercase YAML", ".YAML"},
{"mixed case Yaml", ".Yaml"},
{"lowercase yml", ".yml"},
{"uppercase YML", ".YML"},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "test"+tt.ext)
content := "name: test\nidentity: test identity\n"
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
p, err := LoadPersona(path)
if err != nil {
t.Fatalf("LoadPersona failed for extension %s: %v", tt.ext, err)
}
if p.Name != "test" {
t.Errorf("Name = %q, want %q", p.Name, "test")
}
})
}
}
func TestCapitalizeFirst(t *testing.T) { func TestCapitalizeFirst(t *testing.T) {
tests := []struct { tests := []struct {
input string input string
@@ -237,3 +379,400 @@ func TestListBuiltinPersonasReturnsEmptySlice(t *testing.T) {
t.Error("ListBuiltinPersonas should return empty slice, not nil") t.Error("ListBuiltinPersonas should return empty slice, not nil")
} }
} }
func TestYAMLMultilineStrings(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "multiline.yaml")
// Test literal block scalar (|) which preserves newlines
content := `name: multiline
display_name: Multiline Test
identity: |
First line.
Second line.
Third line.
focus:
- item one
ignore: []
severity:
major: Major issue
minor: Minor issue
nit: Nit
`
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
p, err := LoadPersona(path)
if err != nil {
t.Fatalf("LoadPersona failed: %v", err)
}
// Literal block scalar preserves newlines
if !strings.Contains(p.Identity, "\n") {
t.Error("Identity should contain newlines from literal block scalar")
}
if !strings.Contains(p.Identity, "Second line") {
t.Error("Identity should contain 'Second line'")
}
}
func TestYAMLComments(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "comments.yaml")
content := `# This is a comment
name: commented # inline comment
display_name: Commented Persona
# Another comment
identity: Test identity
focus:
- item # comment after item
ignore: []
severity:
major: Major
minor: Minor
nit: Nit
`
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
p, err := LoadPersona(path)
if err != nil {
t.Fatalf("LoadPersona failed: %v", err)
}
// Comments should be ignored
if p.Name != "commented" {
t.Errorf("Name = %q, want %q", p.Name, "commented")
}
if p.Focus[0] != "item" {
t.Errorf("Focus[0] = %q, want %q", p.Focus[0], "item")
}
}
func TestYAMLDeeplyNestedRejection(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "deeply-nested.yaml")
// Build a deeply nested YAML structure that exceeds MaxYAMLDepth (20).
// Each level adds 2 to the depth count (key + value mapping).
var sb strings.Builder
sb.WriteString("name: test\nidentity: test\nnested:\n")
indent := " "
for i := 0; i < 25; i++ {
sb.WriteString(strings.Repeat(indent, i+1))
sb.WriteString(fmt.Sprintf("level%d:\n", i))
}
sb.WriteString(strings.Repeat(indent, 26))
sb.WriteString("value: too-deep\n")
if err := os.WriteFile(path, []byte(sb.String()), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Error("expected error for deeply nested YAML, got nil")
}
if !strings.Contains(err.Error(), "nesting depth exceeds") {
t.Errorf("error = %q, want containing 'nesting depth exceeds'", err.Error())
}
}
func TestYAMLFileSizeLimit(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "huge.yaml")
// Create a file larger than MaxPersonaFileSize (64 KB)
content := "name: test\nidentity: " + strings.Repeat("x", MaxPersonaFileSize+1) + "\n"
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Error("expected error for oversized file, got nil")
}
if !strings.Contains(err.Error(), "exceeds maximum size") {
t.Errorf("error = %q, want containing 'exceeds maximum size'", err.Error())
}
}
func TestYAMLAliasCycleDetection(t *testing.T) {
// Test that our checkYAMLDepth function handles alias cycles gracefully
// by using the seen map to prevent infinite recursion.
// We test this directly because go-yaml's parser handles most cycles
// at parse time, but we need to ensure our checker is robust.
// Create a node structure where an alias points to a parent node,
// simulating what could happen with malicious input that bypasses
// go-yaml's cycle detection.
parent := &yaml.Node{
Kind: yaml.MappingNode,
Content: []*yaml.Node{
{Kind: yaml.ScalarNode, Value: "name"},
{Kind: yaml.ScalarNode, Value: "test"},
{Kind: yaml.ScalarNode, Value: "nested"},
},
}
// Create a child that aliases back to the parent (artificial cycle)
aliasToParent := &yaml.Node{
Kind: yaml.AliasNode,
Alias: parent,
}
parent.Content = append(parent.Content, aliasToParent)
nodeCount := 0
seen := make(map[*yaml.Node]struct{})
// This should NOT hang or stack overflow - the seen map prevents infinite recursion
err := checkYAMLDepth(parent, 0, MaxYAMLDepth, MaxYAMLNodes, seen, &nodeCount)
if err != nil {
t.Errorf("unexpected error traversing cyclic structure: %v", err)
}
// Verify we tracked the parent in the seen map
if _, ok := seen[parent]; !ok {
t.Error("parent node not tracked in seen map")
}
}
func TestYAMLMultiDocumentRejection(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "multi.yaml")
// Multi-document YAML (documents separated by ---)
content := `name: first
identity: first document
---
name: second
identity: second document
`
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Error("expected error for multi-document YAML, got nil")
}
if !strings.Contains(err.Error(), "multi-document") {
t.Errorf("error = %q, want containing 'multi-document'", err.Error())
}
}
func TestYAMLNodeCountLimit(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "wide.yaml")
// Build a YAML structure that's shallow but wide - many keys at the same level
// to test the node count limit (should exceed MaxYAMLNodes = 1000)
var sb strings.Builder
sb.WriteString("name: test\nidentity: test\n")
for i := 0; i < 600; i++ {
sb.WriteString(fmt.Sprintf("key%d: value%d\n", i, i))
}
if err := os.WriteFile(path, []byte(sb.String()), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Error("expected error for wide YAML exceeding node count, got nil")
}
if !strings.Contains(err.Error(), "node count exceeds") {
t.Errorf("error = %q, want containing 'node count exceeds'", err.Error())
}
}
func TestCheckYAMLDepthCycleDetectionDirect(t *testing.T) {
// Direct test of cycle detection in checkYAMLDepth by creating
// a node structure with an artificial cycle.
// This tests the seen map logic independent of go-yaml's parsing.
node := &yaml.Node{
Kind: yaml.MappingNode,
Content: []*yaml.Node{
{Kind: yaml.ScalarNode, Value: "key"},
{Kind: yaml.ScalarNode, Value: "value"},
},
}
// Create a cycle by making a child reference the parent
cycleChild := &yaml.Node{
Kind: yaml.AliasNode,
Alias: node, // Points back to the parent
}
node.Content = append(node.Content,
&yaml.Node{Kind: yaml.ScalarNode, Value: "cyclic"},
cycleChild,
)
nodeCount := 0
seen := make(map[*yaml.Node]struct{})
err := checkYAMLDepth(node, 0, MaxYAMLDepth, MaxYAMLNodes, seen, &nodeCount)
// Should complete without infinite recursion due to cycle detection
if err != nil {
t.Errorf("unexpected error: %v", err)
}
// The seen map should contain multiple entries
if len(seen) < 2 {
t.Errorf("seen map has %d entries, expected at least 2", len(seen))
}
}
func TestListBuiltinPersonasSortedOrder(t *testing.T) {
names := ListBuiltinPersonas()
if len(names) < 2 {
t.Skip("need at least 2 personas to test ordering")
}
// Verify the list is sorted
for i := 1; i < len(names); i++ {
if names[i-1] > names[i] {
t.Errorf("ListBuiltinPersonas not sorted: %q > %q", names[i-1], names[i])
}
}
}
func TestYAMLUnknownFieldsRejected(t *testing.T) {
tests := []struct {
name string
content string
wantErr string
}{
{
name: "unknown top-level field",
content: `name: test
identity: test identity
unknown_field: should fail
`,
wantErr: "unknown_field",
},
{
name: "typo in field name",
content: `name: test
identiy: typo should fail
`,
wantErr: "identiy",
},
{
name: "unknown field in severity",
content: `name: test
identity: test
severity:
major: Major
minro: typo
`,
wantErr: "minro",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "unknown.yaml")
if err := os.WriteFile(path, []byte(tt.content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Errorf("expected error for unknown field %q, got nil", tt.wantErr)
return
}
if !strings.Contains(err.Error(), tt.wantErr) {
t.Errorf("error = %q, want containing %q", err.Error(), tt.wantErr)
}
})
}
}
func TestJSONUnknownFieldsRejected(t *testing.T) {
tests := []struct {
name string
content string
wantErr string
}{
{
name: "unknown top-level field",
content: `{
"name": "test",
"identity": "test identity",
"unknown_field": "should fail"
}`,
wantErr: "unknown_field",
},
{
name: "typo in field name",
content: `{
"name": "test",
"identiy": "typo should fail"
}`,
wantErr: "identiy",
},
{
name: "unknown field in severity",
content: `{
"name": "test",
"identity": "test",
"severity": {
"major": "ok",
"miner": "typo"
}
}`,
wantErr: "miner",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "test.json")
if err := os.WriteFile(path, []byte(tt.content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Fatal("expected error for unknown field, got nil")
}
if !strings.Contains(err.Error(), tt.wantErr) {
t.Errorf("error = %q, want to contain %q", err.Error(), tt.wantErr)
}
})
}
}
func TestLoadPersonaSymlink(t *testing.T) {
// Create a regular persona file
dir := t.TempDir()
realFile := filepath.Join(dir, "real.yaml")
content := `name: test
identity: test identity
`
if err := os.WriteFile(realFile, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
// Create a symlink to it
symlink := filepath.Join(dir, "link.yaml")
if err := os.Symlink(realFile, symlink); err != nil {
t.Fatalf("failed to create symlink: %v", err)
}
// LoadPersona should work via symlink
p, err := LoadPersona(symlink)
if err != nil {
t.Fatalf("LoadPersona via symlink failed: %v", err)
}
if p.Name != "test" {
t.Errorf("Name = %q, want %q", p.Name, "test")
}
}
-26
View File
@@ -1,26 +0,0 @@
{
"name": "architect",
"display_name": "Software Architect",
"identity": "You are a software architect reviewing code for design quality.\n\nYour expertise:\n- Design patterns and anti-patterns\n- Code organization and module boundaries\n- API design and contracts\n- Testability and dependency injection\n- Consistency with existing architecture\n- Technical debt identification",
"focus": [
"Design pattern violations or misuse",
"Module boundary violations (inappropriate coupling)",
"API design issues (unclear contracts, leaky abstractions)",
"Testability problems (hidden dependencies, god objects)",
"Inconsistency with existing codebase patterns",
"Unnecessary complexity or over-engineering",
"Missing abstractions or premature abstraction"
],
"ignore": [
"Security vulnerabilities (security persona handles these)",
"Performance micro-optimizations",
"Code style and formatting",
"Documentation typos",
"Test implementation details"
],
"severity": {
"major": "Architectural violations that will cause maintenance problems or make the codebase harder to evolve",
"minor": "Design issues that reduce clarity or testability but don't block progress",
"nit": "Minor pattern deviations or style preferences"
}
}
+37
View File
@@ -0,0 +1,37 @@
# Software Architect Persona
# Focuses on design quality, patterns, and code organization
name: architect
display_name: Software Architect
identity: |
You are a software architect reviewing code for design quality.
Your expertise:
- Design patterns and anti-patterns
- Code organization and module boundaries
- API design and contracts
- Testability and dependency injection
- Consistency with existing architecture
- Technical debt identification
focus:
- Design pattern violations or misuse
- Module boundary violations (inappropriate coupling)
- API design issues (unclear contracts, leaky abstractions)
- Testability problems (hidden dependencies, god objects)
- Inconsistency with existing codebase patterns
- Unnecessary complexity or over-engineering
- Missing abstractions or premature abstraction
ignore:
- Security vulnerabilities (security persona handles these)
- Performance micro-optimizations
- Code style and formatting
- Documentation typos
- Test implementation details
severity:
major: "Architectural violations that will cause maintenance problems or make the codebase harder to evolve"
minor: "Design issues that reduce clarity or testability but don't block progress"
nit: "Minor pattern deviations or style preferences"
-26
View File
@@ -1,26 +0,0 @@
{
"name": "docs",
"display_name": "Documentation Reviewer",
"identity": "You are a documentation specialist reviewing code for clarity and documentation quality.\n\nYour expertise:\n- API documentation and examples\n- Code comments and their accuracy\n- Error message clarity\n- README and guide quality\n- Naming clarity and self-documenting code",
"focus": [
"Missing or outdated documentation",
"Unclear or misleading comments",
"Poor error messages (cryptic, unhelpful, missing context)",
"Confusing naming (functions, variables, types)",
"Missing examples for complex APIs",
"Inconsistent terminology",
"Documentation that contradicts the code"
],
"ignore": [
"Security vulnerabilities",
"Performance issues",
"Design patterns",
"Test coverage",
"Code style (unless it affects readability)"
],
"severity": {
"major": "Documentation that actively misleads or missing docs for critical functionality",
"minor": "Unclear documentation or poor error messages that will confuse users",
"nit": "Minor clarity improvements or typo fixes"
}
}
+36
View File
@@ -0,0 +1,36 @@
# Documentation Reviewer Persona
# Focuses on clarity, documentation quality, and self-documenting code
name: docs
display_name: Documentation Reviewer
identity: |
You are a documentation specialist reviewing code for clarity and documentation quality.
Your expertise:
- API documentation and examples
- Code comments and their accuracy
- Error message clarity
- README and guide quality
- Naming clarity and self-documenting code
focus:
- Missing or outdated documentation
- Unclear or misleading comments
- Poor error messages (cryptic, unhelpful, missing context)
- Confusing naming (functions, variables, types)
- Missing examples for complex APIs
- Inconsistent terminology
- Documentation that contradicts the code
ignore:
- Security vulnerabilities
- Performance issues
- Design patterns
- Test coverage
- Code style (unless it affects readability)
severity:
major: "Documentation that actively misleads or missing docs for critical functionality"
minor: "Unclear documentation or poor error messages that will confuse users"
nit: "Minor clarity improvements or typo fixes"
-26
View File
@@ -1,26 +0,0 @@
{
"name": "security",
"display_name": "Security Specialist",
"identity": "You are a security specialist reviewing code for vulnerabilities.\n\nYour expertise:\n- OWASP Top 10 vulnerabilities\n- Injection attacks (SQL, command, path traversal, template)\n- Authentication and authorization patterns\n- Secrets management and exposure risks\n- Race conditions with security implications\n- Event sourcing attack vectors (replay attacks, event injection)",
"focus": [
"Injection attacks (SQL, command, path traversal, template injection)",
"Authentication and authorization gaps or bypasses",
"Secrets exposure (hardcoded credentials, tokens in logs, config leaks)",
"Input validation failures (unsanitized input, unsafe deserialization)",
"Race conditions that could be exploited",
"Cryptographic weaknesses (weak algorithms, improper key handling)",
"Information disclosure through error messages or logs"
],
"ignore": [
"Code style and naming conventions",
"Performance optimizations (unless security-related)",
"Documentation quality",
"General code quality or readability",
"Test coverage"
],
"severity": {
"major": "Exploitable vulnerabilities: auth bypass, injection, data exfiltration, privilege escalation, RCE",
"minor": "Defense-in-depth issues: missing rate limiting, verbose errors, weak input validation",
"nit": "Theoretical risks with low exploitability or impact"
}
}
+37
View File
@@ -0,0 +1,37 @@
# Security Specialist Persona
# Focuses on vulnerabilities, auth issues, and security best practices
name: security
display_name: Security Specialist
identity: |
You are a security specialist reviewing code for vulnerabilities.
Your expertise:
- OWASP Top 10 vulnerabilities
- Injection attacks (SQL, command, path traversal, template)
- Authentication and authorization patterns
- Secrets management and exposure risks
- Race conditions with security implications
- Event sourcing attack vectors (replay attacks, event injection)
focus:
- Injection attacks (SQL, command, path traversal, template injection)
- Authentication and authorization gaps or bypasses
- Secrets exposure (hardcoded credentials, tokens in logs, config leaks)
- Input validation failures (unsanitized input, unsafe deserialization)
- Race conditions that could be exploited
- Cryptographic weaknesses (weak algorithms, improper key handling)
- Information disclosure through error messages or logs
ignore:
- Code style and naming conventions
- Performance optimizations (unless security-related)
- Documentation quality
- General code quality or readability
- Test coverage
severity:
major: "Exploitable vulnerabilities: auth bypass, injection, data exfiltration, privilege escalation, RCE"
minor: "Defense-in-depth issues: missing rate limiting, verbose errors, weak input validation"
nit: "Theoretical risks with low exploitability or impact"
+150
View File
@@ -0,0 +1,150 @@
package review
import (
"context"
"log/slog"
"strings"
)
// RepoPersonaPath is the directory path where repo-specific personas are stored.
const RepoPersonaPath = ".review-bot/personas"
// GiteaClient defines the subset of gitea.Client methods needed for loading repo personas.
// This interface allows for easier testing and decouples the review package from gitea.
type GiteaClient interface {
ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error)
GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error)
}
// ContentEntry represents a file or directory entry from the contents API.
// This mirrors gitea.ContentEntry to avoid import cycles.
type ContentEntry struct {
Name string `json:"name"`
Path string `json:"path"`
Type string `json:"type"` // "file" or "dir"
}
// LoadRepoPersonas fetches personas from a repository's .review-bot/personas/ directory.
// Returns an empty map (not nil) if the directory doesn't exist or is empty.
// Individual parse failures are logged and skipped; the remaining personas are still returned.
// Auth errors and other non-404 errors are propagated.
// Files exceeding MaxPersonaFileSize are rejected to prevent resource exhaustion.
func LoadRepoPersonas(ctx context.Context, client GiteaClient, owner, repo string) (map[string]*Persona, error) {
result := make(map[string]*Persona)
entries, err := client.ListContents(ctx, owner, repo, RepoPersonaPath)
if err != nil {
// Check if this is a 404 (directory doesn't exist) - expected case
if isNotFoundError(err) {
slog.Debug("no repo personas directory found", "repo", owner+"/"+repo)
return result, nil
}
// Other errors (auth, server) should propagate
return nil, err
}
if len(entries) == 0 {
slog.Debug("repo personas directory is empty", "repo", owner+"/"+repo)
return result, nil
}
for _, entry := range entries {
if entry.Type != "file" {
continue
}
// Only process YAML files
if !isYAMLFile(entry.Name) {
continue
}
content, err := client.GetFileContent(ctx, owner, repo, entry.Path)
if err != nil {
slog.Warn("could not fetch repo persona file",
"file", entry.Path,
"repo", owner+"/"+repo,
"error", err)
continue
}
// Enforce size limit before parsing to prevent resource exhaustion
if len(content) > MaxPersonaFileSize {
slog.Warn("repo persona file exceeds maximum size",
"file", entry.Path,
"repo", owner+"/"+repo,
"size", len(content),
"max", MaxPersonaFileSize)
continue
}
persona, err := ParsePersonaBytes([]byte(content), entry.Path)
if err != nil {
slog.Warn("could not parse repo persona file",
"file", entry.Path,
"repo", owner+"/"+repo,
"error", err)
continue
}
result[persona.Name] = persona
slog.Debug("loaded repo persona",
"name", persona.Name,
"file", entry.Path,
"repo", owner+"/"+repo)
}
return result, nil
}
// MergePersonas combines built-in personas with repo personas.
// Repo personas take precedence on name collision.
// Returns a new map; inputs are not modified.
func MergePersonas(builtin, repo map[string]*Persona) map[string]*Persona {
result := make(map[string]*Persona, len(builtin)+len(repo))
// Copy built-in personas first
for name, p := range builtin {
result[name] = p
}
// Overlay repo personas (override on collision)
for name, p := range repo {
if _, exists := result[name]; exists {
slog.Debug("repo persona overrides built-in", "name", name)
}
result[name] = p
}
return result
}
// GetBuiltinPersonasMap returns all built-in personas as a map keyed by name.
// Returns an empty map (not nil) if loading fails.
func GetBuiltinPersonasMap() map[string]*Persona {
result := make(map[string]*Persona)
for _, name := range ListBuiltinPersonas() {
p, err := LoadBuiltinPersona(name)
if err != nil {
slog.Warn("could not load built-in persona", "name", name, "error", err)
continue
}
result[name] = p
}
return result
}
// isYAMLFile checks if a filename has a YAML extension.
func isYAMLFile(name string) bool {
lower := strings.ToLower(name)
return strings.HasSuffix(lower, ".yaml") || strings.HasSuffix(lower, ".yml")
}
// isNotFoundError checks if an error represents a 404 response.
// This uses a specific "HTTP 404" substring match rather than a generic "not found"
// match to avoid masking authentication failures or transport errors that might
// contain "not found" in their message.
func isNotFoundError(err error) bool {
if err == nil {
return false
}
return strings.Contains(err.Error(), "HTTP 404")
}
+443
View File
@@ -0,0 +1,443 @@
package review
import (
"context"
"errors"
"strings"
"testing"
)
func TestParsePersonaBytes(t *testing.T) {
tests := []struct {
name string
data string
source string
wantName string
wantErr string
}{
{
name: "valid yaml",
data: `name: test
identity: test identity
focus:
- testing
`,
source: "test.yaml",
wantName: "test",
},
{
name: "missing name",
data: "identity: test\n",
source: "test.yaml",
wantErr: "name is required",
},
{
name: "invalid yaml",
data: "not: valid:\n yaml: [broken",
source: "test.yaml",
wantErr: "parse",
},
{
name: "json format by extension",
data: `{"name": "jsontest", "identity": "json identity"}`,
source: "test.json",
wantName: "jsontest",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
p, err := ParsePersonaBytes([]byte(tt.data), tt.source)
if tt.wantErr != "" {
if err == nil {
t.Fatalf("expected error containing %q, got nil", tt.wantErr)
}
if !strings.Contains(err.Error(), tt.wantErr) {
t.Errorf("error = %q, want containing %q", err.Error(), tt.wantErr)
}
return
}
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if p.Name != tt.wantName {
t.Errorf("Name = %q, want %q", p.Name, tt.wantName)
}
})
}
}
// mockGiteaClient implements GiteaClient for testing.
type mockGiteaClient struct {
contents map[string][]ContentEntry // path -> entries
files map[string]string // path -> content
listErr error
fileErr map[string]error // path -> error
}
func (m *mockGiteaClient) ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error) {
if m.listErr != nil {
return nil, m.listErr
}
entries, ok := m.contents[path]
if !ok {
return nil, errors.New("list contents .review-bot/personas: HTTP 404: not found")
}
return entries, nil
}
func (m *mockGiteaClient) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
if m.fileErr != nil {
if err, ok := m.fileErr[filepath]; ok {
return "", err
}
}
content, ok := m.files[filepath]
if !ok {
return "", errors.New("HTTP 404: file not found")
}
return content, nil
}
func TestLoadRepoPersonas(t *testing.T) {
ctx := context.Background()
t.Run("directory not found returns empty map", func(t *testing.T) {
client := &mockGiteaClient{} // No contents configured -> 404
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if personas == nil {
t.Error("expected empty map, got nil")
}
if len(personas) != 0 {
t.Errorf("expected 0 personas, got %d", len(personas))
}
})
t.Run("empty directory returns empty map", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {},
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(personas) != 0 {
t.Errorf("expected 0 personas, got %d", len(personas))
}
})
t.Run("loads valid personas", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "trading.yaml", Path: ".review-bot/personas/trading.yaml", Type: "file"},
{Name: "crypto.yaml", Path: ".review-bot/personas/crypto.yaml", Type: "file"},
},
},
files: map[string]string{
".review-bot/personas/trading.yaml": `name: trading
display_name: Trading Expert
identity: You are a trading expert.
focus:
- order handling
- risk management
`,
".review-bot/personas/crypto.yaml": `name: crypto
display_name: Crypto Expert
identity: You are a cryptography expert.
focus:
- key management
- encryption
`,
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(personas) != 2 {
t.Fatalf("expected 2 personas, got %d", len(personas))
}
if personas["trading"] == nil {
t.Error("expected trading persona")
}
if personas["crypto"] == nil {
t.Error("expected crypto persona")
}
if personas["trading"].DisplayName != "Trading Expert" {
t.Errorf("trading display name = %q, want %q", personas["trading"].DisplayName, "Trading Expert")
}
})
t.Run("skips invalid persona files", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "valid.yaml", Path: ".review-bot/personas/valid.yaml", Type: "file"},
{Name: "invalid.yaml", Path: ".review-bot/personas/invalid.yaml", Type: "file"},
},
},
files: map[string]string{
".review-bot/personas/valid.yaml": `name: valid
identity: Valid persona
`,
".review-bot/personas/invalid.yaml": "not valid yaml: [broken",
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
// Should have the valid one, skip the invalid
if len(personas) != 1 {
t.Fatalf("expected 1 persona (skipped invalid), got %d", len(personas))
}
if personas["valid"] == nil {
t.Error("expected valid persona")
}
})
t.Run("skips non-yaml files", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "persona.yaml", Path: ".review-bot/personas/persona.yaml", Type: "file"},
{Name: "README.md", Path: ".review-bot/personas/README.md", Type: "file"},
{Name: "notes.txt", Path: ".review-bot/personas/notes.txt", Type: "file"},
},
},
files: map[string]string{
".review-bot/personas/persona.yaml": `name: test
identity: Test persona
`,
".review-bot/personas/README.md": "# Personas\n\nPut your personas here.",
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(personas) != 1 {
t.Fatalf("expected 1 persona (yaml only), got %d", len(personas))
}
})
t.Run("skips subdirectories", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "persona.yaml", Path: ".review-bot/personas/persona.yaml", Type: "file"},
{Name: "subdir", Path: ".review-bot/personas/subdir", Type: "dir"},
},
},
files: map[string]string{
".review-bot/personas/persona.yaml": `name: test
identity: Test persona
`,
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(personas) != 1 {
t.Fatalf("expected 1 persona (files only), got %d", len(personas))
}
})
t.Run("propagates auth errors", func(t *testing.T) {
client := &mockGiteaClient{
listErr: errors.New("HTTP 401: unauthorized"),
}
_, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err == nil {
t.Fatal("expected error for auth failure")
}
if !strings.Contains(err.Error(), "401") {
t.Errorf("error = %q, want containing '401'", err.Error())
}
})
t.Run("skips files that fail to fetch", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "good.yaml", Path: ".review-bot/personas/good.yaml", Type: "file"},
{Name: "bad.yaml", Path: ".review-bot/personas/bad.yaml", Type: "file"},
},
},
files: map[string]string{
".review-bot/personas/good.yaml": `name: good
identity: Good persona
`,
},
fileErr: map[string]error{
".review-bot/personas/bad.yaml": errors.New("HTTP 500: internal server error"),
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(personas) != 1 {
t.Fatalf("expected 1 persona (skipped failed fetch), got %d", len(personas))
}
})
t.Run("skips oversized files", func(t *testing.T) {
// Create a content string that exceeds MaxPersonaFileSize (64KB)
oversizedContent := strings.Repeat("a", MaxPersonaFileSize+1)
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "normal.yaml", Path: ".review-bot/personas/normal.yaml", Type: "file"},
{Name: "huge.yaml", Path: ".review-bot/personas/huge.yaml", Type: "file"},
},
},
files: map[string]string{
".review-bot/personas/normal.yaml": `name: normal
identity: Normal sized persona
`,
".review-bot/personas/huge.yaml": oversizedContent,
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
// Should have the normal one, skip the oversized
if len(personas) != 1 {
t.Fatalf("expected 1 persona (skipped oversized), got %d", len(personas))
}
if personas["normal"] == nil {
t.Error("expected normal persona")
}
})
}
func TestMergePersonas(t *testing.T) {
builtin := map[string]*Persona{
"security": {Name: "security", Identity: "Built-in security"},
"docs": {Name: "docs", Identity: "Built-in docs"},
}
repo := map[string]*Persona{
"security": {Name: "security", Identity: "Repo security override"},
"trading": {Name: "trading", Identity: "Repo trading"},
}
merged := MergePersonas(builtin, repo)
t.Run("repo overrides builtin on collision", func(t *testing.T) {
if merged["security"].Identity != "Repo security override" {
t.Errorf("security identity = %q, want repo override", merged["security"].Identity)
}
})
t.Run("builtin preserved when no collision", func(t *testing.T) {
if merged["docs"].Identity != "Built-in docs" {
t.Errorf("docs identity = %q, want built-in", merged["docs"].Identity)
}
})
t.Run("repo-only persona added", func(t *testing.T) {
if merged["trading"] == nil {
t.Error("expected trading persona from repo")
}
if merged["trading"].Identity != "Repo trading" {
t.Errorf("trading identity = %q, want repo", merged["trading"].Identity)
}
})
t.Run("original maps not modified", func(t *testing.T) {
if builtin["trading"] != nil {
t.Error("builtin map was modified")
}
if len(repo) != 2 {
t.Error("repo map was modified")
}
})
}
func TestGetBuiltinPersonasMap(t *testing.T) {
personas := GetBuiltinPersonasMap()
if len(personas) == 0 {
t.Fatal("expected at least one built-in persona")
}
// Verify expected personas exist
expected := []string{"security", "architect", "docs"}
for _, name := range expected {
if personas[name] == nil {
t.Errorf("expected built-in persona %q", name)
}
}
// Verify personas are valid
for name, p := range personas {
if p.Name != name {
t.Errorf("persona %q has mismatched name %q", name, p.Name)
}
if p.Identity == "" {
t.Errorf("persona %q has empty identity", name)
}
}
}
func TestIsYAMLFile(t *testing.T) {
tests := []struct {
name string
want bool
}{
{"test.yaml", true},
{"test.yml", true},
{"test.YAML", true},
{"test.YML", true},
{"test.json", false},
{"test.md", false},
{"test.txt", false},
{"yaml", false},
{"yaml.md", false},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := isYAMLFile(tt.name); got != tt.want {
t.Errorf("isYAMLFile(%q) = %v, want %v", tt.name, got, tt.want)
}
})
}
}
func TestIsNotFoundError(t *testing.T) {
tests := []struct {
err error
want bool
}{
{nil, false},
{errors.New("HTTP 404: not found"), true},
{errors.New("HTTP 404"), true},
// Intentionally false: generic "not found" could mask auth/transport errors.
// Only explicit HTTP 404 responses should be treated as "directory doesn't exist".
{errors.New("something not found"), false},
{errors.New("HTTP 401: unauthorized"), false},
{errors.New("connection refused"), false},
}
for _, tt := range tests {
name := "nil"
if tt.err != nil {
name = tt.err.Error()
}
t.Run(name, func(t *testing.T) {
if got := isNotFoundError(tt.err); got != tt.want {
t.Errorf("isNotFoundError(%v) = %v, want %v", tt.err, got, tt.want)
}
})
}
}
+127
View File
@@ -0,0 +1,127 @@
#!/usr/bin/env bash
# check-deps.sh - Enforces the strict dependency allowlist from CONVENTIONS.md
# Exit 1 if any unapproved import is found.
#
# Requires: Bash 4+ (for associative arrays), Go toolchain
#
# The allowlist is parsed from CONVENTIONS.md to maintain a single source of truth.
# Enforces Scope column: "test only" packages cannot appear in non-test code.
set -euo pipefail
# Check bash version
if ((BASH_VERSINFO[0] < 4)); then
echo "❌ Bash 4+ required (found ${BASH_VERSION})"
echo " On macOS: brew install bash"
exit 1
fi
CONVENTIONS_FILE="${1:-CONVENTIONS.md}"
if [ ! -f "$CONVENTIONS_FILE" ]; then
echo "❌ CONVENTIONS.md not found"
exit 1
fi
# Parse approved packages from CONVENTIONS.md table using awk (POSIX-compatible)
# Format: | `package` | use case | scope |
declare -A ALLOWED_PROD=()
declare -A ALLOWED_TEST=()
while IFS= read -r line; do
# Use awk to extract package and scope from table row
pkg=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]*`|`[[:space:]]*$/, "", $2); print $2}')
scope=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]+|[[:space:]]+$/, "", $4); print tolower($4)}')
if [ -n "$pkg" ] && [ "$pkg" != "Package" ] && [[ "$pkg" =~ ^[a-zA-Z] ]]; then
if [[ "$scope" == *"test"* ]]; then
ALLOWED_TEST["$pkg"]=1
else
ALLOWED_PROD["$pkg"]=1
fi
fi
done < <(grep '| `' "$CONVENTIONS_FILE" 2>/dev/null || true)
ALL_ALLOWED=("${!ALLOWED_PROD[@]}" "${!ALLOWED_TEST[@]}")
if [ ${#ALL_ALLOWED[@]} -eq 0 ]; then
echo "⚠️ No approved packages found in $CONVENTIONS_FILE"
echo " (This is fine if you want stdlib-only)"
fi
# Helper: check if import matches any package in an associative array (literal prefix, no glob)
matches_allowlist() {
local import="$1"
shift
local -n allowlist=$1
for allowed in "${!allowlist[@]}"; do
# Exact match
if [ "$import" = "$allowed" ]; then
return 0
fi
# Literal prefix match for subpackages: must match "pkg/" exactly
if [ "${import#"$allowed/"}" != "$import" ]; then
return 0
fi
done
return 1
}
# Get direct module dependencies from go.mod
DIRECT_IMPORTS=$(go list -m -f '{{if and (not .Indirect) (not .Main)}}{{.Path}}{{end}}' all 2>&1) || {
echo "❌ Failed to list dependencies: $DIRECT_IMPORTS"
exit 1
}
DIRECT_IMPORTS=$(echo "$DIRECT_IMPORTS" | grep -v '^$' || true)
if [ -z "$DIRECT_IMPORTS" ]; then
echo "✅ No external dependencies"
exit 0
fi
# Check ALL direct dependencies are in some allowlist
VIOLATIONS=""
while IFS= read -r import; do
[ -z "$import" ] && continue
if ! matches_allowlist "$import" ALLOWED_PROD && ! matches_allowlist "$import" ALLOWED_TEST; then
VIOLATIONS="${VIOLATIONS} - ${import} (not in allowlist)"$'\n'
fi
done <<< "$DIRECT_IMPORTS"
if [ -n "$VIOLATIONS" ]; then
echo "❌ UNAPPROVED DEPENDENCIES DETECTED"
echo ""
echo "The following imports are not in the allowlist:"
printf "%s" "$VIOLATIONS"
echo ""
echo "To add a dependency, update CONVENTIONS.md (requires Aaron's approval)"
exit 1
fi
# Enforce Scope: test-only packages must not appear in non-test code
# Get imports used by non-test code only (go list -deps without -test excludes test deps)
PROD_IMPORTS=$(go list -deps -f '{{if not .Standard}}{{.ImportPath}}{{end}}' ./... 2>/dev/null || true)
TEST_ONLY_IN_PROD=""
for test_pkg in "${!ALLOWED_TEST[@]}"; do
# Use word-boundary matching: exact match or followed by /
if echo "$PROD_IMPORTS" | grep -qE "^${test_pkg}(/|\$|$)"; then
TEST_ONLY_IN_PROD="${TEST_ONLY_IN_PROD} - ${test_pkg} (marked 'test only' but used in production code)"$'\n'
fi
done
if [ -n "$TEST_ONLY_IN_PROD" ]; then
echo "❌ TEST-ONLY DEPENDENCIES IN PRODUCTION CODE"
echo ""
printf "%s" "$TEST_ONLY_IN_PROD"
echo ""
echo "These packages are marked 'test only' in CONVENTIONS.md"
echo "and must only be imported from *_test.go files."
exit 1
fi
echo "✅ All dependencies are approved"
echo " Direct module deps: $(echo "$DIRECT_IMPORTS" | wc -l | tr -d ' ')"
echo " Production allowlist: ${#ALLOWED_PROD[@]}, Test-only allowlist: ${#ALLOWED_TEST[@]}"