Compare commits

...

10 Commits

Author SHA1 Message Date
Rodin 565a077b01 fix: CI config - correct patterns path, increase timeout
CI / test (pull_request) Successful in 13s
CI / review (gpt-4.1, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 21s
CI / review (gpt-5, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 2m18s
- PATTERNS_FILES: docs/ does not exist in go-patterns, use patterns/
- LLM_TIMEOUT: 600s (gpt-5-mini needs more time for larger diffs)
2026-05-01 19:06:18 -07:00
Rodin dab7871cb4 fix: address review findings on budget system
CI / test (pull_request) Successful in 14s
CI / review (gpt-5, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 1m41s
CI / review (gpt-5-mini, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 3m2s
- Account for truncation marker tokens when computing diff budget
  (prevents EstTokens exceeding model limit in edge cases)
- Rune-safe truncation for both UserMeta and Diff (no split multi-byte)
- Fix misleading comment (1000 chars → ~1000 tokens/4000 chars)
- Extract marker strings as constants
- Add unit tests for BuildSystemBase and BuildUserMeta
2026-05-01 18:59:07 -07:00
Rodin d9cacf6f62 fix: strict budget enforcement + deterministic model matching
CI / test (pull_request) Successful in 14s
CI / review (gpt-5, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 1m59s
CI / review (gpt-5-mini, gpt, GPT_REVIEW_TOKEN) (pull_request) Failing after 5m12s
Addresses review findings:
- Replace map-based model limits with ordered slice (longest-prefix-first)
  for deterministic matching
- Truncate UserMeta when base content alone exceeds budget (keeps first
  4000 chars + truncation marker)
- Remove hard minimum of 1000 tokens for diff budget — use 0 as floor
  to guarantee total never exceeds limit
- Handle zero-budget edge case (diff replaced with manual-review message)
- Add tests: huge UserMeta, all-sections-huge never exceeds limit
2026-05-01 18:51:22 -07:00
Rodin 67d835909f feat: add context budget system for LLM overflow (#19)
CI / test (pull_request) Successful in 13s
CI / review (gpt-5, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 1m30s
CI / review (gpt-5-mini, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m29s
Adds a budget package that estimates token usage and progressively
trims context to fit within model-specific limits.

Trim order (least important first):
1. Language patterns
2. Repository conventions
3. Full file context
4. Diff (truncated as last resort)

When content is trimmed, a note is appended to the user prompt so
the LLM knows context was reduced.

- New budget package with Fit(), EstimateTokens(), LimitForModel()
- Model limit table (GPT-4.1: 128K, GPT-5: 200K, Claude: 200K)
- Refactored review/prompt.go: BuildSystemBase() and BuildUserMeta()
  extract non-trimmable content; old functions delegate to new ones
- main.go uses budget.Fit() instead of direct prompt assembly
- 7 unit tests covering all trim paths

Closes #19
2026-05-01 18:46:53 -07:00
rodin ef3e6d5e87 Merge pull request 'fix: path-escape file paths and eliminate url package shadowing' (#17) from fix/url-escaping-and-shadow into main
CI / test (push) Successful in 15s
CI / review (gpt-5, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5-mini, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
2026-05-01 21:55:02 +00:00
Rodin aade891129 docs: add package-level documentation
CI / test (pull_request) Successful in 14s
CI / review (gpt-5, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Failing after 55s
CI / review (gpt-5-mini, gpt, GPT_REVIEW_TOKEN) (pull_request) Failing after 1m6s
Per go-patterns/package-design.md, every package needs a doc comment.
Added to gitea, llm, and review packages.
2026-05-01 14:54:58 -07:00
Rodin 7b42de67ca fix: handle empty path in ListContents (root listing)
CI / test (pull_request) Successful in 14s
CI / review (gpt-5, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 1m4s
CI / review (gpt-5-mini, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m8s
Empty path now yields /contents instead of /contents/ (trailing slash).
Added doc comment noting empty path = repo root.
2026-05-01 14:46:40 -07:00
Rodin dd2661fe14 fix: address all review findings from PR #17
CI / test (pull_request) Successful in 13s
CI / review (gpt-5, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 52s
CI / review (gpt-5-mini, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m12s
- Rename all remaining url locals to reqURL (consistency)
- Use http.MethodGet/http.MethodPost constants
- Document escapePath: relative paths only, double-encoding expected
- Add TestEscapePath with 7 edge cases (empty, spaces, #, deep, encoded)
2026-05-01 14:40:19 -07:00
Rodin 98a4772f30 fix: path-escape file paths and eliminate url package shadowing
CI / test (pull_request) Successful in 14s
CI / review (gpt-5, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 55s
CI / review (gpt-5-mini, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m48s
- Add escapePath() helper: escapes each path segment individually
  (preserves slashes as separators, escapes spaces/#/? etc)
- Apply to GetFileContent, GetFileContentRef, ListContents
- Rename doGet parameter from url to reqURL (avoids shadowing net/url)
- Rename local variables in GetFileContent/ListContents for consistency

Addresses remaining findings from PR #16 review.
2026-05-01 14:33:18 -07:00
rodin fc23b6ebe9 Merge pull request 'fix: quick wins (#7, #9, #13)' (#16) from fix/quick-wins into main
CI / test (push) Successful in 14s
CI / review (gpt-5, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5-mini, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
2026-05-01 21:30:57 +00:00
9 changed files with 580 additions and 28 deletions
+3 -2
View File
@@ -31,7 +31,7 @@ jobs:
model: gpt-5
- name: gpt
token_secret: GPT_REVIEW_TOKEN
model: gpt-5-mini
model: gpt-4.1
steps:
- uses: actions/checkout@v4
- uses: actions/setup-go@v5
@@ -49,5 +49,6 @@ jobs:
LLM_MODEL: ${{ matrix.model }}
CONVENTIONS_FILE: "CONVENTIONS.md"
PATTERNS_REPO: "rodin/go-patterns"
PATTERNS_FILES: "README.md,docs/"
PATTERNS_FILES: "README.md,patterns/"
LLM_TIMEOUT: "600"
run: ./review-bot
+226
View File
@@ -0,0 +1,226 @@
// Package budget manages LLM context window budgeting for review-bot.
//
// It estimates token usage and progressively trims context content to fit
// within model-specific limits. The trimming order (least important first):
// patterns → conventions → file context → diff truncation.
package budget
import (
"fmt"
"strings"
)
// modelLimit pairs a model name prefix with its context window size.
type modelLimit struct {
prefix string
limit int
}
// Known model context limits (in tokens), ordered longest-prefix-first
// for deterministic matching.
var modelLimits = []modelLimit{
{"claude-haiku-3.5-20241022", 200_000},
{"claude-sonnet-4-20250514", 200_000},
{"claude-opus-4-20250514", 200_000},
{"gpt-4.1-mini", 128_000},
{"gpt-5-mini", 200_000},
{"gpt-4.1", 128_000},
{"gpt-5", 200_000},
}
const defaultLimit = 128_000
// reserveTokens is headroom for the response generation.
const reserveTokens = 4_000
const diffTruncMarker = "\n\n... [diff truncated due to context limit] ..."
const diffTooLargeMarker = "... [diff too large for context window — review manually] ..."
const userMetaTruncMarker = "\n... [description truncated] ..."
// EstimateTokens estimates the number of tokens in a string.
// Uses the rough heuristic of ~4 characters per token, which is
// conservative for English text and code.
func EstimateTokens(s string) int {
return len(s) / 4
}
// LimitForModel returns the context window size for the given model.
// Uses longest-prefix-first matching for deterministic results.
func LimitForModel(model string) int {
for _, ml := range modelLimits {
if model == ml.prefix || strings.HasPrefix(model, ml.prefix) {
return ml.limit
}
}
return defaultLimit
}
// Sections holds the prompt content sections in trim priority order.
// When the total exceeds the budget, sections are trimmed from least
// important (Patterns) to most important (Diff).
type Sections struct {
SystemBase string // Core instructions (never trimmed)
Patterns string // Language patterns (trimmed first)
Conventions string // Repo conventions (trimmed second)
FileContext string // Full file content (trimmed third)
Diff string // The actual diff (trimmed last, only truncated)
UserMeta string // PR title, description, CI status (never trimmed)
}
// Result holds the trimmed content and metadata about what was dropped.
type Result struct {
SystemPrompt string
UserPrompt string
Trimmed []string // Human-readable descriptions of what was trimmed
EstTokens int // Estimated total tokens after trimming
}
// Fit trims sections to fit within the model's context limit.
// Returns the assembled prompts and a list of what was trimmed.
func Fit(model string, sections Sections) Result {
limit := LimitForModel(model) - reserveTokens
baseTokens := EstimateTokens(sections.SystemBase) + EstimateTokens(sections.UserMeta)
available := limit - baseTokens
if available < 0 {
// Base content alone exceeds budget. Truncate UserMeta (keep first ~1000 tokens).
if len(sections.UserMeta) > 4000 {
sections.UserMeta = truncateUTF8(sections.UserMeta, 4000) + userMetaTruncMarker
baseTokens = EstimateTokens(sections.SystemBase) + EstimateTokens(sections.UserMeta)
available = limit - baseTokens
}
if available < 0 {
available = 0
}
}
// Trimmable sections in priority order (first = dropped first)
type entry struct {
name string
content *string
}
entries := []entry{
{"patterns", &sections.Patterns},
{"conventions", &sections.Conventions},
{"file context", &sections.FileContext},
}
// Check if everything fits
totalTrimmable := EstimateTokens(sections.Diff)
for _, e := range entries {
totalTrimmable += EstimateTokens(*e.content)
}
var trimmed []string
if totalTrimmable > available {
// Trim from least important
for i := range entries {
tokens := EstimateTokens(*entries[i].content)
if tokens == 0 {
continue
}
trimmed = append(trimmed, fmt.Sprintf("%s (~%dK tokens)", entries[i].name, tokens/1000))
*entries[i].content = ""
// Recalculate
totalTrimmable = EstimateTokens(sections.Diff)
for _, e := range entries {
totalTrimmable += EstimateTokens(*e.content)
}
if totalTrimmable <= available {
break
}
}
}
// If still too large, truncate the diff
if totalTrimmable > available {
diffBudget := available
for _, e := range entries {
diffBudget -= EstimateTokens(*e.content)
}
if diffBudget < 0 {
diffBudget = 0
}
// Reserve space for truncation marker
markerBudget := EstimateTokens(diffTruncMarker)
effectiveBudget := diffBudget - markerBudget
if effectiveBudget < 0 {
effectiveBudget = 0
}
maxChars := effectiveBudget * 4
if maxChars < len(sections.Diff) {
removed := EstimateTokens(sections.Diff) - diffBudget
trimmed = append(trimmed, fmt.Sprintf("diff truncated (~%dK tokens removed)", removed/1000))
if maxChars > 0 {
sections.Diff = truncateUTF8(sections.Diff, maxChars) + diffTruncMarker
} else {
sections.Diff = diffTooLargeMarker
}
}
}
finalTokens := baseTokens
for _, e := range entries {
finalTokens += EstimateTokens(*e.content)
}
finalTokens += EstimateTokens(sections.Diff)
return buildResult(sections, trimmed, finalTokens)
}
func buildResult(s Sections, trimmed []string, estTokens int) Result {
var sys strings.Builder
sys.WriteString(s.SystemBase)
if s.Patterns != "" {
sys.WriteString("\n\n## Language Patterns & Idioms\n\nUse the following patterns as review criteria. Code that violates these established patterns is a finding:\n\n")
sys.WriteString(s.Patterns)
}
if s.Conventions != "" {
sys.WriteString("\n\n## Repository Conventions\n\nThe repository has the following coding conventions that must be respected:\n\n")
sys.WriteString(s.Conventions)
}
var usr strings.Builder
usr.WriteString(s.UserMeta)
if s.FileContext != "" {
usr.WriteString("\n### Full File Context (modified files)\n\n")
usr.WriteString(s.FileContext)
usr.WriteString("\n")
}
usr.WriteString("\n### Diff (changes to review)\n\n```diff\n")
usr.WriteString(s.Diff)
usr.WriteString("\n```\n")
if len(trimmed) > 0 {
usr.WriteString("\n⚠️ Note: Context was trimmed to fit model limits. Dropped: ")
usr.WriteString(strings.Join(trimmed, ", "))
usr.WriteString("\n")
}
return Result{
SystemPrompt: sys.String(),
UserPrompt: usr.String(),
Trimmed: trimmed,
EstTokens: estTokens,
}
}
// truncateUTF8 truncates s to at most maxBytes without splitting multi-byte
// UTF-8 characters. Returns a valid UTF-8 string of at most maxBytes bytes.
func truncateUTF8(s string, maxBytes int) string {
if len(s) <= maxBytes {
return s
}
// Walk backwards from maxBytes to find a valid UTF-8 boundary
for maxBytes > 0 && !isUTF8Start(s[maxBytes]) {
maxBytes--
}
return s[:maxBytes]
}
// isUTF8Start returns true if b is a valid start byte for a UTF-8 sequence
// (single-byte ASCII or multi-byte lead byte, not a continuation byte).
func isUTF8Start(b byte) bool {
return b&0xC0 != 0x80
}
+203
View File
@@ -0,0 +1,203 @@
package budget
import (
"strings"
"testing"
)
func TestEstimateTokens(t *testing.T) {
tests := []struct {
input string
want int
}{
{"", 0},
{"abcd", 1},
{"12345678", 2},
{strings.Repeat("x", 400), 100},
}
for _, tt := range tests {
got := EstimateTokens(tt.input)
if got != tt.want {
t.Errorf("EstimateTokens(%d chars) = %d, want %d", len(tt.input), got, tt.want)
}
}
}
func TestLimitForModel(t *testing.T) {
tests := []struct {
model string
want int
}{
{"gpt-4.1", 128_000},
{"gpt-5", 200_000},
{"gpt-5-mini", 200_000},
{"unknown-model", defaultLimit},
{"gpt-4.1-2026-01-01", 128_000}, // prefix match
}
for _, tt := range tests {
got := LimitForModel(tt.model)
if got != tt.want {
t.Errorf("LimitForModel(%q) = %d, want %d", tt.model, got, tt.want)
}
}
}
func TestFit_AllFits(t *testing.T) {
s := Sections{
SystemBase: "system instructions",
Patterns: "some patterns",
Conventions: "some conventions",
FileContext: "file content",
Diff: "diff content",
UserMeta: "PR: title\n",
}
result := Fit("gpt-5", s)
if len(result.Trimmed) != 0 {
t.Errorf("expected no trimming, got %v", result.Trimmed)
}
if !strings.Contains(result.SystemPrompt, "some patterns") {
t.Error("expected patterns in system prompt")
}
if !strings.Contains(result.SystemPrompt, "some conventions") {
t.Error("expected conventions in system prompt")
}
if !strings.Contains(result.UserPrompt, "file content") {
t.Error("expected file context in user prompt")
}
}
func TestFit_TrimsPatterns(t *testing.T) {
// Create content that exceeds 128K token budget for gpt-4.1
// Budget ≈ 128K - 4K reserve = 124K tokens = ~496K chars
// Fill patterns with enough to push over
bigPatterns := strings.Repeat("x", 500_000) // ~125K tokens
s := Sections{
SystemBase: "base",
Patterns: bigPatterns,
Conventions: "conventions",
FileContext: "files",
Diff: "diff",
UserMeta: "meta",
}
result := Fit("gpt-4.1", s)
if len(result.Trimmed) == 0 {
t.Fatal("expected trimming")
}
if !strings.Contains(result.Trimmed[0], "patterns") {
t.Errorf("expected patterns to be trimmed first, got %v", result.Trimmed)
}
if strings.Contains(result.SystemPrompt, bigPatterns[:100]) {
t.Error("expected patterns to be removed from output")
}
// Conventions should survive
if !strings.Contains(result.SystemPrompt, "conventions") {
t.Error("expected conventions to survive after patterns trimmed")
}
}
func TestFit_TrimsConventions(t *testing.T) {
// Patterns + conventions + diff all exceed budget even after patterns removed
big := strings.Repeat("y", 520_000) // ~130K tokens each (exceeds 124K budget even alone)
s := Sections{
SystemBase: "base",
Patterns: big,
Conventions: big,
FileContext: "files",
Diff: "diff",
UserMeta: "meta",
}
result := Fit("gpt-4.1", s)
if len(result.Trimmed) < 2 {
t.Fatalf("expected at least 2 trimmed, got %v", result.Trimmed)
}
if !strings.Contains(result.Trimmed[0], "patterns") {
t.Errorf("expected patterns trimmed first, got %s", result.Trimmed[0])
}
if !strings.Contains(result.Trimmed[1], "conventions") {
t.Errorf("expected conventions trimmed second, got %s", result.Trimmed[1])
}
}
func TestFit_TruncatesDiff(t *testing.T) {
// Only diff is huge, no patterns/conventions
hugeDiff := strings.Repeat("z", 600_000) // ~150K tokens > 128K limit
s := Sections{
SystemBase: "base",
Diff: hugeDiff,
UserMeta: "meta",
}
result := Fit("gpt-4.1", s)
if len(result.Trimmed) == 0 {
t.Fatal("expected diff truncation")
}
if !strings.Contains(result.Trimmed[len(result.Trimmed)-1], "diff truncated") {
t.Errorf("expected diff truncation note, got %v", result.Trimmed)
}
if !strings.Contains(result.UserPrompt, "[diff truncated due to context limit]") {
t.Error("expected truncation marker in user prompt")
}
}
func TestFit_PreservesNoteInOutput(t *testing.T) {
big := strings.Repeat("w", 500_000)
s := Sections{
SystemBase: "base",
Patterns: big,
Diff: "small diff",
UserMeta: "meta",
}
result := Fit("gpt-4.1", s)
if !strings.Contains(result.UserPrompt, "⚠️ Note: Context was trimmed") {
t.Error("expected trimming note in user prompt")
}
}
func TestFit_HugeUserMeta(t *testing.T) {
// UserMeta so large that base alone exceeds limit
// Use a unique marker past the truncation point
hugeDesc := strings.Repeat("d", 5000) + "UNIQUE_MARKER_PAST_TRUNCATION" + strings.Repeat("d", 595_000)
s := Sections{
SystemBase: "base",
Diff: "small diff",
UserMeta: hugeDesc,
}
result := Fit("gpt-4.1", s)
limit := LimitForModel("gpt-4.1") - reserveTokens
if result.EstTokens > limit {
t.Errorf("EstTokens %d exceeds limit %d", result.EstTokens, limit)
}
// Content past truncation point should not be present
if strings.Contains(result.UserPrompt, "UNIQUE_MARKER_PAST_TRUNCATION") {
t.Error("expected UserMeta to be truncated but found content past truncation point")
}
// Truncation marker should be present
if !strings.Contains(result.UserPrompt, "[description truncated]") {
t.Error("expected truncation marker in output")
}
}
func TestFit_NeverExceedsLimit(t *testing.T) {
// All sections huge — verify final tokens never exceed limit
big := strings.Repeat("a", 200_000)
s := Sections{
SystemBase: strings.Repeat("s", 8000),
Patterns: big,
Conventions: big,
FileContext: big,
Diff: big,
UserMeta: strings.Repeat("m", 8000),
}
result := Fit("gpt-4.1", s)
limit := LimitForModel("gpt-4.1") - reserveTokens
if result.EstTokens > limit {
t.Errorf("EstTokens %d exceeds limit %d (trimmed: %v)", result.EstTokens, limit, result.Trimmed)
}
}
+17 -5
View File
@@ -10,6 +10,7 @@ import (
"strings"
"time"
"gitea.weiker.me/rodin/review-bot/budget"
"gitea.weiker.me/rodin/review-bot/gitea"
"gitea.weiker.me/rodin/review-bot/llm"
"gitea.weiker.me/rodin/review-bot/review"
@@ -141,15 +142,26 @@ func main() {
log.Printf("Loaded patterns from %s (%d bytes)", *patternsRepo, len(patterns))
}
// Step 7: Build prompts
systemPrompt := review.BuildSystemPrompt(conventions, patterns)
userPrompt := review.BuildUserPrompt(pr.Title, pr.Body, diff, fileContext, ciPassed, ciDetails)
// Step 7: Budget-aware prompt assembly
sections := budget.Sections{
SystemBase: review.BuildSystemBase(),
Patterns: patterns,
Conventions: conventions,
FileContext: fileContext,
Diff: diff,
UserMeta: review.BuildUserMeta(pr.Title, pr.Body, ciPassed, ciDetails),
}
budgetResult := budget.Fit(*llmModel, sections)
log.Printf("Token estimate: ~%dK (limit: %dK)", budgetResult.EstTokens/1000, budget.LimitForModel(*llmModel)/1000)
if len(budgetResult.Trimmed) > 0 {
log.Printf("Context trimmed: %v", budgetResult.Trimmed)
}
// Step 8: Call LLM
log.Printf("Sending to LLM (%s)...", *llmModel)
messages := []llm.Message{
{Role: "system", Content: systemPrompt},
{Role: "user", Content: userPrompt},
{Role: "system", Content: budgetResult.SystemPrompt},
{Role: "user", Content: budgetResult.UserPrompt},
}
response, err := llmClient.Complete(ctx, messages)
+38 -17
View File
@@ -1,3 +1,6 @@
// Package gitea provides a client for the Gitea API.
// It supports pull request operations, file content retrieval,
// and review submission.
package gitea
import (
@@ -56,8 +59,8 @@ type ChangedFile struct {
// GetPullRequest fetches PR metadata.
func (c *Client) GetPullRequest(ctx context.Context, owner, repo string, number int) (*PullRequest, error) {
url := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d", c.baseURL, owner, repo, number)
body, err := c.doGet(ctx, url)
reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d", c.baseURL, owner, repo, number)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("fetch PR: %w", err)
}
@@ -70,8 +73,8 @@ func (c *Client) GetPullRequest(ctx context.Context, owner, repo string, number
// GetPullRequestDiff fetches the unified diff for a PR.
func (c *Client) GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error) {
url := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d.diff", c.baseURL, owner, repo, number)
body, err := c.doGet(ctx, url)
reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d.diff", c.baseURL, owner, repo, number)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return "", fmt.Errorf("fetch diff: %w", err)
}
@@ -80,8 +83,8 @@ func (c *Client) GetPullRequestDiff(ctx context.Context, owner, repo string, num
// GetPullRequestFiles fetches the list of files changed in a PR.
func (c *Client) GetPullRequestFiles(ctx context.Context, owner, repo string, number int) ([]ChangedFile, error) {
url := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d/files", c.baseURL, owner, repo, number)
body, err := c.doGet(ctx, url)
reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d/files", c.baseURL, owner, repo, number)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("fetch PR files: %w", err)
}
@@ -94,8 +97,8 @@ func (c *Client) GetPullRequestFiles(ctx context.Context, owner, repo string, nu
// GetCommitStatuses fetches CI statuses for a commit SHA.
func (c *Client) GetCommitStatuses(ctx context.Context, owner, repo, sha string) ([]CommitStatus, error) {
url := fmt.Sprintf("%s/api/v1/repos/%s/%s/commits/%s/statuses", c.baseURL, owner, repo, sha)
body, err := c.doGet(ctx, url)
reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/commits/%s/statuses", c.baseURL, owner, repo, sha)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("fetch commit statuses: %w", err)
}
@@ -108,8 +111,8 @@ func (c *Client) GetCommitStatuses(ctx context.Context, owner, repo, sha string)
// GetFileContent fetches a file from the default branch of a repo.
func (c *Client) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
url := fmt.Sprintf("%s/api/v1/repos/%s/%s/raw/%s", c.baseURL, owner, repo, filepath)
body, err := c.doGet(ctx, url)
reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/raw/%s", c.baseURL, owner, repo, escapePath(filepath))
body, err := c.doGet(ctx, reqURL)
if err != nil {
return "", fmt.Errorf("fetch file %s: %w", filepath, err)
}
@@ -118,7 +121,7 @@ func (c *Client) GetFileContent(ctx context.Context, owner, repo, filepath strin
// GetFileContentRef fetches a file from a specific ref (branch/tag/sha) in a repo.
func (c *Client) GetFileContentRef(ctx context.Context, owner, repo, filepath, ref string) (string, error) {
reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/raw/%s?ref=%s", c.baseURL, owner, repo, filepath, url.QueryEscape(ref))
reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/raw/%s?ref=%s", c.baseURL, owner, repo, escapePath(filepath), url.QueryEscape(ref))
body, err := c.doGet(ctx, reqURL)
if err != nil {
return "", fmt.Errorf("fetch file %s@%s: %w", filepath, ref, err)
@@ -129,7 +132,7 @@ func (c *Client) GetFileContentRef(ctx context.Context, owner, repo, filepath, r
// PostReview submits a review to a PR.
// event should be "APPROVED" or "REQUEST_CHANGES".
func (c *Client) PostReview(ctx context.Context, owner, repo string, number int, event, body string) error {
url := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d/reviews", c.baseURL, owner, repo, number)
reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d/reviews", c.baseURL, owner, repo, number)
payload := struct {
Body string `json:"body"`
@@ -144,7 +147,7 @@ func (c *Client) PostReview(ctx context.Context, owner, repo string, number int,
return fmt.Errorf("marshal review payload: %w", err)
}
req, err := http.NewRequestWithContext(ctx, "POST", url, bytes.NewReader(data))
req, err := http.NewRequestWithContext(ctx, http.MethodPost, reqURL, bytes.NewReader(data))
if err != nil {
return fmt.Errorf("create review request: %w", err)
}
@@ -164,8 +167,8 @@ func (c *Client) PostReview(ctx context.Context, owner, repo string, number int,
return nil
}
func (c *Client) doGet(ctx context.Context, url string) ([]byte, error) {
req, err := http.NewRequestWithContext(ctx, "GET", url, nil)
func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {
req, err := http.NewRequestWithContext(ctx, http.MethodGet, reqURL, nil)
if err != nil {
return nil, err
}
@@ -184,6 +187,18 @@ func (c *Client) doGet(ctx context.Context, url string) ([]byte, error) {
return io.ReadAll(resp.Body)
}
// escapePath escapes each segment of a relative file path for use in URLs.
// Slashes are preserved as path separators; other special characters are escaped.
// Input should be a relative path (no leading slash). Already-encoded segments
// will be double-encoded, which is the desired behavior for user-provided paths.
func escapePath(p string) string {
parts := strings.Split(p, "/")
for i, part := range parts {
parts[i] = url.PathEscape(part)
}
return strings.Join(parts, "/")
}
// ContentEntry represents a file or directory entry from the contents API.
type ContentEntry struct {
Name string `json:"name"`
@@ -192,9 +207,15 @@ type ContentEntry struct {
}
// ListContents lists files and directories at a given path in a repo.
// Pass an empty path to list the repository root.
func (c *Client) ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error) {
url := fmt.Sprintf("%s/api/v1/repos/%s/%s/contents/%s", c.baseURL, owner, repo, path)
body, err := c.doGet(ctx, url)
var reqURL string
if path == "" {
reqURL = fmt.Sprintf("%s/api/v1/repos/%s/%s/contents", c.baseURL, owner, repo)
} else {
reqURL = fmt.Sprintf("%s/api/v1/repos/%s/%s/contents/%s", c.baseURL, owner, repo, escapePath(path))
}
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("list contents %s: %w", path, err)
}
+24
View File
@@ -294,3 +294,27 @@ func TestGetAllFilesInPath_File(t *testing.T) {
t.Errorf("unexpected content: %q", files["README.md"])
}
}
func TestEscapePath(t *testing.T) {
tests := []struct {
name string
input string
want string
}{
{"simple", "src/main.go", "src/main.go"},
{"spaces", "my dir/my file.go", "my%20dir/my%20file.go"},
{"special chars", "path/file#1.txt", "path/file%231.txt"},
{"empty", "", ""},
{"single segment", "README.md", "README.md"},
{"nested deep", "a/b/c/d.md", "a/b/c/d.md"},
{"already encoded", "path/file%20name.go", "path/file%2520name.go"},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := escapePath(tt.input)
if got != tt.want {
t.Errorf("escapePath(%q) = %q, want %q", tt.input, got, tt.want)
}
})
}
}
+1
View File
@@ -1,3 +1,4 @@
// Package llm provides a client for OpenAI-compatible chat completion APIs.
package llm
import (
+28 -4
View File
@@ -1,3 +1,5 @@
// Package review builds prompts for AI code review and parses LLM responses
// into structured review results.
package review
import (
@@ -5,8 +7,10 @@ import (
"strings"
)
// BuildSystemPrompt constructs the system prompt for the LLM reviewer.
func BuildSystemPrompt(conventions, patterns string) string {
// BuildSystemBase returns the core system prompt instructions without
// patterns or conventions. Used by the budget package to separate
// trimmable from non-trimmable content.
func BuildSystemBase() string {
var sb strings.Builder
sb.WriteString("You are an expert code reviewer. Review the provided pull request diff carefully.\n\n")
@@ -40,6 +44,15 @@ func BuildSystemPrompt(conventions, patterns string) string {
sb.WriteString("- Line numbers should reference the new file line numbers from the diff headers.\n")
sb.WriteString("- If the diff is empty or trivial (only formatting/whitespace), APPROVE with no findings.\n")
return sb.String()
}
// BuildSystemPrompt constructs the full system prompt with patterns and conventions.
// Deprecated: Use BuildSystemBase with budget.Fit for context-aware assembly.
func BuildSystemPrompt(conventions, patterns string) string {
var sb strings.Builder
sb.WriteString(BuildSystemBase())
if patterns != "" {
sb.WriteString(fmt.Sprintf("\n\n## Language Patterns & Idioms\n\nUse the following patterns as review criteria. Code that violates these established patterns is a finding:\n\n%s\n", patterns))
}
@@ -51,8 +64,9 @@ func BuildSystemPrompt(conventions, patterns string) string {
return sb.String()
}
// BuildUserPrompt constructs the user message with PR context.
func BuildUserPrompt(title, description, diff, fileContext string, ciPassed bool, ciDetails string) string {
// BuildUserMeta returns the PR metadata header (title, description, CI status)
// without the diff or file context. Used by the budget package.
func BuildUserMeta(title, description string, ciPassed bool, ciDetails string) string {
var sb strings.Builder
sb.WriteString(fmt.Sprintf("## Pull Request: %s\n\n", title))
@@ -71,6 +85,16 @@ func BuildUserPrompt(title, description, diff, fileContext string, ciPassed bool
sb.WriteString(fmt.Sprintf("CI Details: %s\n", ciDetails))
}
return sb.String()
}
// BuildUserPrompt constructs the user message with PR context.
// Deprecated: Use BuildUserMeta with budget.Fit for context-aware assembly.
func BuildUserPrompt(title, description, diff, fileContext string, ciPassed bool, ciDetails string) string {
var sb strings.Builder
sb.WriteString(BuildUserMeta(title, description, ciPassed, ciDetails))
if fileContext != "" {
sb.WriteString("\n### Full File Context (modified files)\n\n")
sb.WriteString(fileContext)
+40
View File
@@ -116,3 +116,43 @@ func TestBuildUserPrompt_WithoutFileContext(t *testing.T) {
t.Error("should not include file context section when empty")
}
}
func TestBuildSystemBase(t *testing.T) {
result := BuildSystemBase()
if result == "" {
t.Fatal("BuildSystemBase returned empty string")
}
if !strings.Contains(result, "expert code reviewer") {
t.Error("expected reviewer role in system base")
}
if !strings.Contains(result, "REQUEST_CHANGES") {
t.Error("expected verdict format in system base")
}
if !strings.Contains(result, "JSON") {
t.Error("expected JSON output instruction in system base")
}
}
func TestBuildUserMeta(t *testing.T) {
result := BuildUserMeta("Fix bug", "Some description", true, "all checks passed")
if !strings.Contains(result, "Fix bug") {
t.Error("expected title in user meta")
}
if !strings.Contains(result, "Some description") {
t.Error("expected description in user meta")
}
if !strings.Contains(result, "PASSED") {
t.Error("expected CI PASSED status")
}
}
func TestBuildUserMeta_CIFailed(t *testing.T) {
result := BuildUserMeta("Title", "", false, "test job failed")
if !strings.Contains(result, "FAILED") {
t.Error("expected CI FAILED status")
}
if strings.Contains(result, "Description") {
t.Error("expected no description section when empty")
}
}