# review-bot

AI-powered code review bot for Gitea pull requests. Fetches diff + context, sends to an LLM, and posts a structured review (APPROVE / REQUEST_CHANGES) back to the PR.

## Features

- **Multi-provider**: OpenAI-compatible, Anthropic Messages API, and SAP AI Core
- **Context-aware**: Fetches full file content, conventions, language patterns, CI status
- **Path-scoped docs**: `doc-map` config injects only the governing design docs for changed paths
- **Smart budget**: Automatically trims context to fit model token limits
- **Idempotent reviews**: Posts new review, then cleans up stale ones (one review per bot)
- **Custom prompts**: Load additional instructions from a file (e.g. security-focused review)
- **Minimal dependencies**: Go stdlib + `github.com/goccy/go-yaml` only

## Quick Start: Composite Action

The easiest way to use review-bot in your Gitea CI:

```yaml
# .gitea/workflows/review.yml
name: Review
on:
  pull_request:
    types: [opened, synchronize]

jobs:
  review:
    runs-on: ubuntu-24.04
    steps:
      - uses: actions/checkout@v4
      - uses: https://gitea.weiker.me/rodin/review-bot/.gitea/actions/review@v0.1.0
        with:
          reviewer-token: ${{ secrets.REVIEW_TOKEN }}
          reviewer-name: code-review
          llm-base-url: ${{ secrets.LLM_BASE_URL }}
          llm-api-key: ${{ secrets.LLM_API_KEY }}
          llm-model: gpt-4.1
```

That's it. Every PR gets an automated review.

## Examples

### Single reviewer with conventions

```yaml
jobs:
  review:
    runs-on: ubuntu-24.04
    steps:
      - uses: actions/checkout@v4
      - uses: https://gitea.weiker.me/rodin/review-bot/.gitea/actions/review@v0.1.0
        with:
          reviewer-token: ${{ secrets.REVIEW_TOKEN }}
          reviewer-name: reviewer
          llm-base-url: ${{ secrets.LLM_BASE_URL }}
          llm-api-key: ${{ secrets.LLM_API_KEY }}
          llm-model: gpt-4.1
          conventions-file: CONVENTIONS.md
          timeout: '600'
```

### Two reviewers with different models (diversity of opinion)

```yaml
jobs:
  review:
    runs-on: ubuntu-24.04
    strategy:
      matrix:
        include:
          - name: gpt
            model: gpt-4.1
            token_secret: GPT_REVIEW_TOKEN
          - name: claude
            model: claude-sonnet-4-20250514
            token_secret: CLAUDE_REVIEW_TOKEN
            provider: anthropic
    steps:
      - uses: actions/checkout@v4
      - uses: https://gitea.weiker.me/rodin/review-bot/.gitea/actions/review@v0.1.0
        with:
          reviewer-token: ${{ secrets[matrix.token_secret] }}
          reviewer-name: ${{ matrix.name }}
          llm-base-url: ${{ secrets.LLM_BASE_URL }}
          llm-api-key: ${{ secrets.LLM_API_KEY }}
          llm-model: ${{ matrix.model }}
          llm-provider: ${{ matrix.provider }}
          conventions-file: CONVENTIONS.md
```

Each reviewer posts independently and only cleans up its own stale reviews.

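As a rough illustration of "only cleans up its own stale reviews", the selection step might look like the Go sketch below. The `Review` type, the helper names, and in particular the sentinel format are assumptions made for this example; they are not taken from review-bot's source.

```go
package main

import (
	"fmt"
	"strings"
)

// Review is a hypothetical, minimal view of a PR review; the client type
// used by review-bot may look different.
type Review struct {
	ID   int64
	Body string
}

// sentinelFor builds an illustrative marker for a reviewer name.
// ASSUMPTION: the sentinel text is invented for this example.
func sentinelFor(name string) string {
	return "<!-- reviewer:" + name + " -->"
}

// staleReviews returns the IDs of reviews that carry this reviewer's
// sentinel, excluding the review that was just posted.
func staleReviews(reviews []Review, name string, justPosted int64) []int64 {
	if name == "" {
		return nil // no reviewer name: cleanup is skipped
	}
	marker := sentinelFor(name)
	var stale []int64
	for _, r := range reviews {
		if r.ID != justPosted && strings.Contains(r.Body, marker) {
			stale = append(stale, r.ID)
		}
	}
	return stale
}

func main() {
	reviews := []Review{
		{ID: 1, Body: "old gpt review " + sentinelFor("gpt")},
		{ID: 2, Body: "claude review " + sentinelFor("claude")},
		{ID: 3, Body: "new gpt review " + sentinelFor("gpt")},
	}
	// Review 3 was just posted by "gpt", so only review 1 is stale for it.
	fmt.Println(staleReviews(reviews, "gpt", 3)) // prints [1]
}
```

Because the marker includes the full reviewer name and a closing delimiter, a reviewer named `gpt` would not match reviews left by a hypothetical `gpt-security` reviewer.
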
### Multiple review types from a single bot account

Use the same Gitea token but different `reviewer-name` values to run specialized reviews without needing multiple bot accounts:

```yaml
jobs:
  review:
    runs-on: ubuntu-24.04
    strategy:
      matrix:
        include:
          - name: code-quality
            model: gpt-4.1
          - name: security
            model: gpt-4.1
            system_prompt_file: .review/SECURITY.md
          - name: performance
            model: gpt-4.1
            system_prompt_file: .review/PERFORMANCE.md
    steps:
      - uses: actions/checkout@v4
      - uses: https://gitea.weiker.me/rodin/review-bot/.gitea/actions/review@v0.1.0
        with:
          reviewer-token: ${{ secrets.REVIEW_TOKEN }}
          reviewer-name: ${{ matrix.name }}
          llm-base-url: ${{ secrets.LLM_BASE_URL }}
          llm-api-key: ${{ secrets.LLM_API_KEY }}
          llm-model: ${{ matrix.model }}
          system-prompt-file: ${{ matrix.system_prompt_file }}
```

The per-reviewer sentinel, derived from `reviewer-name`, ensures the security review only replaces previous security reviews, never the code-quality or performance reviews.

### With language patterns from another repo

```yaml
- uses: https://gitea.weiker.me/rodin/review-bot/.gitea/actions/review@v0.1.0
  with:
    reviewer-token: ${{ secrets.REVIEW_TOKEN }}
    reviewer-name: reviewer
    llm-base-url: ${{ secrets.LLM_BASE_URL }}
    llm-api-key: ${{ secrets.LLM_API_KEY }}
    llm-model: gpt-4.1
    conventions-file: CLAUDE.md
    patterns-repo: rodin/go-patterns,rodin/kubernetes-conventions
    patterns-files: "README.md,patterns/"
```

Pattern repos are fetched at review time. The reviewer uses them as criteria for idiomatic code.

### Dry run (test without posting)

```yaml
- uses: https://gitea.weiker.me/rodin/review-bot/.gitea/actions/review@v0.1.0
  with:
    reviewer-token: ${{ secrets.REVIEW_TOKEN }}
    reviewer-name: test
    llm-base-url: ${{ secrets.LLM_BASE_URL }}
    llm-api-key: ${{ secrets.LLM_API_KEY }}
    llm-model: gpt-4.1
    dry-run: 'true'
```

Prints the review to CI logs without posting to the PR. Useful for testing prompt changes.

### Using Anthropic directly

```yaml
- uses: https://gitea.weiker.me/rodin/review-bot/.gitea/actions/review@v0.1.0
  with:
    reviewer-token: ${{ secrets.REVIEW_TOKEN }}
    reviewer-name: claude
    llm-base-url: https://api.anthropic.com
    llm-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
    llm-model: claude-sonnet-4-20250514
    llm-provider: anthropic
```

### Using SAP AI Core

For SAP environments with AI Core deployments, use the `aicore` provider for native authentication:

```yaml
- uses: https://gitea.weiker.me/rodin/review-bot/.gitea/actions/review@v0.1.0
  with:
    reviewer-token: ${{ secrets.REVIEW_TOKEN }}
    reviewer-name: aicore-review
    llm-model: anthropic--claude-4.6-sonnet  # or gpt-5
    llm-provider: aicore
    aicore-client-id: ${{ secrets.AICORE_CLIENT_ID }}
    aicore-client-secret: ${{ secrets.AICORE_CLIENT_SECRET }}
    aicore-auth-url: ${{ secrets.AICORE_AUTH_URL }}
    aicore-api-url: ${{ secrets.AICORE_API_URL }}
    aicore-resource-group: default
```

The `aicore` provider handles OAuth token management and deployment discovery automatically. Model names must match the deployment name in AI Core (e.g. `anthropic--claude-4.6-sonnet`, `gpt-5`).

## Action Inputs

| Input | Required | Default | Description |
|-------|----------|---------|-------------|
| `reviewer-token` | Yes | (none) | Gitea token for posting reviews (needs `write:issue`, `write:repository`) |
| `reviewer-name` | No | `""` | Logical identity for this reviewer. Used as sentinel for idempotent cleanup. Set this when running multiple review bots on the same PR. |
| `llm-base-url` | No* | `""` | LLM API base URL (required unless using the `aicore` provider) |
| `llm-api-key` | No* | `""` | LLM API key (required unless using the `aicore` provider) |
| `llm-model` | Yes | (none) | Model name |
| `llm-provider` | No | `openai` | API provider: `openai`, `anthropic`, or `aicore` |
| `aicore-client-id` | No** | `""` | SAP AI Core client ID |
| `aicore-client-secret` | No** | `""` | SAP AI Core client secret |
| `aicore-auth-url` | No** | `""` | SAP AI Core authentication URL |
| `aicore-api-url` | No** | `""` | SAP AI Core API URL |
| `aicore-resource-group` | No | `default` | SAP AI Core resource group |
| `conventions-file` | No | `""` | Path to coding conventions file in the repo |
| `patterns-repo` | No | `""` | Comma-separated repos with language patterns (e.g. `rodin/go-patterns`) |
| `patterns-files` | No | `README.md` | Comma-separated files/directories to fetch from pattern repos |
| `system-prompt-file` | No | `""` | Local file with additional system prompt instructions |
| `doc-map` | No | `""` | Path to a YAML file mapping source path globs to governing design docs |
| `doc-map-max-bytes` | No | `102400` | Maximum bytes of injected doc content from doc-map (default 100KB) |
| `persona` | No | `""` | Built-in persona name (`security`, `architect`, `docs`) |
| `persona-file` | No | `""` | Path to persona file (YAML or JSON) with custom review focus |
| `temperature` | No | `0` | LLM temperature (0 = server default) |
| `timeout` | No | `300` | LLM request timeout in seconds |
| `dry-run` | No | `false` | Print review to stdout instead of posting |
| `update-existing` | No | `true` | Delete stale reviews from the same bot once the new review is posted. Accepts: true/1/yes or false/0/no |
| `version` | No | `latest` | review-bot version to install |

\* Required for `openai` and `anthropic` providers, not for `aicore`.

\*\* Required only for `aicore` provider.

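The input rules in the table (the boolean spellings accepted by `update-existing` and the provider-dependent required inputs) can be sketched in Go as follows. The `Inputs` struct and the function names are hypothetical, and trimming whitespace and ignoring case are assumptions; review-bot's actual validation may differ.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parseBool accepts the spellings listed for update-existing:
// true/1/yes and false/0/no.
// ASSUMPTION: input is trimmed and compared case-insensitively.
func parseBool(s string) (bool, error) {
	switch strings.ToLower(strings.TrimSpace(s)) {
	case "true", "1", "yes":
		return true, nil
	case "false", "0", "no":
		return false, nil
	}
	return false, fmt.Errorf("invalid boolean %q", s)
}

// Inputs is a hypothetical subset of the action inputs.
type Inputs struct {
	Provider           string
	BaseURL            string
	APIKey             string
	AICoreClientID     string
	AICoreClientSecret string
	AICoreAuthURL      string
	AICoreAPIURL       string
}

// validate applies the two footnotes under the inputs table.
func validate(in Inputs) error {
	provider := in.Provider
	if provider == "" {
		provider = "openai" // documented default for llm-provider
	}
	switch provider {
	case "openai", "anthropic":
		if in.BaseURL == "" || in.APIKey == "" {
			return errors.New("llm-base-url and llm-api-key are required for " + provider)
		}
	case "aicore":
		if in.AICoreClientID == "" || in.AICoreClientSecret == "" ||
			in.AICoreAuthURL == "" || in.AICoreAPIURL == "" {
			return errors.New("all aicore-* credentials and URLs are required for aicore")
		}
	default:
		return fmt.Errorf("unknown llm-provider %q", provider)
	}
	return nil
}

func main() {
	update, err := parseBool("yes")
	fmt.Println(update, err) // prints: true <nil>

	// An aicore configuration without credentials is rejected.
	fmt.Println(validate(Inputs{Provider: "aicore"}))
}
```
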
## Runner Requirements

The composite action requires these tools on the runner:

| Tool | Used For |
|------|----------|
| `python3` | JSON parsing during version detection |
| `sha256sum` | Checksum verification of downloaded binary |
| `curl` | Downloading releases and querying the API |

All three are pre-installed on `ubuntu-*` runners (e.g. `ubuntu-24.04`). If you use a custom runner image, ensure these are available.

## How Review Cleanup Works

When `reviewer-name` is set, the bot embeds a hidden HTML sentinel, keyed to that name, in each review. On the next run, it finds and deletes any review containing its own sentinel (except the one it just posted). This means:

- **One review per bot per PR**: no clutter from repeated pushes
- **Multiple bots coexist**: each only cleans up its own reviews
- **Same token, different roles**: a single bot account can post "code-review" and "security" reviews without conflict
- **No extra permissions**: identity comes from the sentinel, not the API

If `reviewer-name` is empty, cleanup is skipped and reviews accumulate across runs.

### Shared Token: Worst-Wins Behavior

When multiple review types share the same Gitea bot account (e.g. code-quality and security), Gitea determines the user's approval state from their **most recent review**. This creates a race condition: if security finds issues (REQUEST_CHANGES) but code-quality finishes last (APPROVE), the PR appears approved.

review-bot handles this automatically with **worst-wins reconciliation**: before posting, each job checks whether any sibling review from the same user already has REQUEST_CHANGES. If so, and this job would post APPROVE, it posts REQUEST_CHANGES instead, maintaining the block.

This ensures the PR stays blocked until all checks pass, regardless of execution order.

**If you need independent approval/block per review type**, use separate Gitea bot accounts with their own tokens.

## Custom Review Prompts

Use `system-prompt-file` to specialize the review focus.
The file contents are appended to the base system prompt as "Additional Review Instructions."

Example `SECURITY_REVIEW.md`:

```markdown
You are performing a security-focused code review.

Focus areas:
- Injection attacks (SQL, command, path traversal, template)
- Authentication/Authorization (missing checks, privilege escalation)
- Secrets exposure (hardcoded credentials, tokens in logs)
- Input validation (unsanitized input, unsafe deserialization)
- Race conditions (TOCTOU, unsynchronized shared state)

Rules:
- Only report findings with security implications
- Ignore style, naming, and general code quality
- MAJOR = exploitable vulnerability, MINOR = hardening opportunity, NIT = theoretical risk
- If no security-relevant changes exist, APPROVE with empty findings
```

## CLI Usage

```bash
review-bot \
  --vcs-url https://gitea.example.com \
  --repo owner/name \
  --pr 42 \
  --reviewer-token "$GITEA_TOKEN" \
  --reviewer-name "code-review" \
  --llm-base-url https://api.openai.com/v1 \
  --llm-api-key "$OPENAI_API_KEY" \
  --llm-model gpt-4.1 \
  --conventions-file CONVENTIONS.md
```

## Environment Variables

All flags have environment variable equivalents:

| Flag | Env Var |
|------|---------|
| `--vcs-url` | `VCS_URL` (fallback: `GITEA_URL`) |
| `--repo` | `GITEA_REPO` |
| `--pr` | `PR_NUMBER` |
| `--reviewer-token` | `REVIEWER_TOKEN` |
| `--reviewer-name` | `REVIEWER_NAME` |
| `--llm-base-url` | `LLM_BASE_URL` |
| `--llm-api-key` | `LLM_API_KEY` |
| `--llm-model` | `LLM_MODEL` |
| `--llm-provider` | `LLM_PROVIDER` |
| `--conventions-file` | `CONVENTIONS_FILE` |
| `--patterns-repo` | `PATTERNS_REPO` |
| `--patterns-files` | `PATTERNS_FILES` |
| `--system-prompt-file` | `SYSTEM_PROMPT_FILE` |
| `--llm-temperature` | `LLM_TEMPERATURE` |
| `--llm-timeout` | `LLM_TIMEOUT` |
| `--update-existing` | `UPDATE_EXISTING` |

## Setup

1. **Create a Gitea bot account** (e.g. `review-bot`)
2. **Generate a token** with scopes: `write:issue`, `write:repository`
3. **Add secrets** to your Gitea repo (Settings → Actions → Secrets):
   - `REVIEW_TOKEN`: the bot's Gitea token
   - `LLM_BASE_URL`: your LLM endpoint
   - `LLM_API_KEY`: your LLM key
4. **Add the workflow** (see Quick Start above)

### Token Scopes Required

| Scope | Purpose |
|-------|---------|
| `write:issue` | Post and delete reviews |
| `write:repository` | Read PR diffs, file content, commit statuses |
| `read:user` | Self-request as reviewer (optional but recommended) |

Without `read:user`, the bot still works but cannot add itself to the PR's reviewer list.

## Development

```bash
go test ./...                            # Unit tests
go vet ./...                             # Static analysis
go build -o review-bot ./cmd/review-bot

# Integration tests (requires env vars set)
go test -tags=integration ./...
```

## Architecture

```
cmd/review-bot/   CLI entrypoint + orchestration
gitea/            Gitea API client (reviews, PRs, files)
llm/              Multi-provider LLM client (OpenAI + Anthropic)
review/           Prompt building, response parsing, formatting
budget/           Token estimation + context trimming
```

## License

MIT

## Review Personas

Personas provide role-based review specialization. Instead of generic code review, each persona focuses on a specific domain (security, architecture, documentation) with tailored prompts and severity calibration.

### Built-in Personas

| Persona | Focus |
|---------|-------|
| `security` | Vulnerabilities, auth bypass, secrets exposure, injection attacks |
| `architect` | Design patterns, code organization, API contracts, testability |
| `docs` | Documentation quality, API clarity, error messages |

### Using Built-in Personas

```yaml
- uses: rodin/review-bot/.gitea/actions/review@v1
  with:
    reviewer-name: security
    persona: security
    llm-model: claude-opus-4-20250514  # Security benefits from strong reasoning
    ...
```

### Multiple Personas in Parallel

```yaml
jobs:
  review:
    strategy:
      matrix:
        include:
          - name: security
            persona: security
          - name: architect
            persona: architect
    steps:
      - uses: rodin/review-bot/.gitea/actions/review@v1
        with:
          reviewer-name: ${{ matrix.name }}
          persona: ${{ matrix.persona }}
          ...
```

Each persona posts independently with its own sentinel, so reviews don't interfere.

### Custom Personas

Create a YAML file with your domain-specific review focus:

```yaml
# .review/personas/trading.yaml
name: trading
display_name: Trading Domain Expert
identity: |
  You are a trading systems expert reviewing code for correctness.
  Your expertise:
  - Order lifecycle and state machines
  - Fill handling and partial fills
  - Position tracking and P&L calculations
  - Event sourcing invariants
focus:
  - Order state machine correctness
  - Fill handling edge cases (partial, overfill)
  - Position and P&L calculation accuracy
  - Event replay determinism
  - Decimal precision for money
ignore:
  - Code style
  - General performance
  - Documentation formatting
severity:
  major: "Bugs that cause incorrect positions, fills, or money calculations"
  minor: "Edge cases that could cause issues under unusual conditions"
  nit: "Clarity improvements for domain logic"
```

Use it in CI:

```yaml
- uses: rodin/review-bot/.gitea/actions/review@v1
  with:
    reviewer-name: trading
    persona-file: .review/personas/trading.yaml
    ...
```

YAML is the recommended format for personas because it supports:

- Multi-line strings with `|` blocks (cleaner identity definitions)
- Comments for documentation
- More readable arrays and nested structures

JSON is also supported for backwards compatibility; just use the `.json` extension.

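For the JSON variant, a loader could pick the parser from the file extension and decode into a struct that mirrors the YAML example. This is only a sketch: the `Persona` struct, the function names, the assumption that JSON personas use the same key names as the YAML example, the acceptance of `.yml`, and the required `name` check are all illustrative and not confirmed by review-bot's source.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// Persona mirrors the fields of the trading.yaml example.
// ASSUMPTION: JSON personas use the same key names as the YAML file.
type Persona struct {
	Name        string            `json:"name"`
	DisplayName string            `json:"display_name"`
	Identity    string            `json:"identity"`
	Focus       []string          `json:"focus"`
	Ignore      []string          `json:"ignore"`
	Severity    map[string]string `json:"severity"`
}

// formatFor chooses a parser from the file extension.
// ASSUMPTION: ".yml" is treated like ".yaml".
func formatFor(path string) (string, error) {
	switch strings.ToLower(filepath.Ext(path)) {
	case ".yaml", ".yml":
		return "yaml", nil
	case ".json":
		return "json", nil
	}
	return "", fmt.Errorf("unsupported persona file extension: %s", path)
}

// parseJSONPersona decodes a JSON persona and rejects one without a name.
func parseJSONPersona(data []byte) (Persona, error) {
	var p Persona
	if err := json.Unmarshal(data, &p); err != nil {
		return Persona{}, err
	}
	if p.Name == "" {
		return Persona{}, errors.New("persona is missing a name")
	}
	return p, nil
}

func main() {
	raw := []byte(`{
	  "name": "trading",
	  "display_name": "Trading Domain Expert",
	  "focus": ["Order state machine correctness", "Decimal precision for money"],
	  "severity": {"major": "Bugs that cause incorrect positions"}
	}`)

	format, err := formatFor(".review/personas/trading.json")
	if err != nil {
		panic(err)
	}
	p, err := parseJSONPersona(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(format, p.Name, len(p.Focus)) // prints: json trading 2
}
```
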
### Persona vs system-prompt-file

| Feature | `persona` / `persona-file` | `system-prompt-file` |
|---------|---------------------------|----------------------|
| Replaces base prompt | Yes | No (appends) |
| Structured format | Yes (YAML/JSON) | No (freeform) |
| Focus/ignore lists | Yes | Manual |
| Severity calibration | Yes | Manual |
| Header display name | Yes | No |
| Built-in options | Yes | No |

Use personas for domain-specialized reviews. Use `system-prompt-file` for minor tweaks to the generic review.
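
The "replaces" versus "appends" distinction in the table can be pictured with a small Go sketch. The function name, the exact heading text used for the appended section, and the behavior when both a persona and a `system-prompt-file` are set are assumptions for illustration only.

```go
package main

import "fmt"

// buildSystemPrompt is a hypothetical helper showing the two behaviors:
// a persona prompt replaces the base prompt, while system-prompt-file
// content is appended under an "Additional Review Instructions" label.
// ASSUMPTION: when both are given, the extra text is appended to the
// persona prompt.
func buildSystemPrompt(base, personaPrompt, extra string) string {
	prompt := base
	if personaPrompt != "" {
		prompt = personaPrompt // persona replaces the generic base prompt
	}
	if extra != "" {
		prompt += "\n\nAdditional Review Instructions:\n\n" + extra
	}
	return prompt
}

func main() {
	base := "You are a code reviewer."

	// system-prompt-file only: the base prompt is kept and extended.
	fmt.Println(buildSystemPrompt(base, "", "Focus on error handling."))
	fmt.Println("---")

	// persona only: the base prompt is not used at all.
	fmt.Println(buildSystemPrompt(base, "You are a security reviewer.", ""))
}
```
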