feat: load personas from target repo .review-bot/personas/ #61
Reference in New Issue
Block a user
Delete Branch "issue-60"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Implements #60: Load personas from target repo
.review-bot/personas/.Changes
review/repo_persona.go (new file)
GiteaClientinterface for fetching repo filesContentEntrytype mirroring gitea.ContentEntryLoadRepoPersonas()to fetch and parse personas from repoMergePersonas()to combine built-in and repo personas (repo takes precedence)GetBuiltinPersonasMap()helper for loading all built-ins as mapreview/persona.go
ParsePersonaBytes()for parsing personas from byte data (for external sources)cmd/review-bot/main.go
--personais specified, check repo personas first, fall back to built-ingiteaClientAdapterto bridge gitea.Client to review.GiteaClient interfaceUsage
Repos can now add domain-specific reviewers:
Testing
ParsePersonaBytesLoadRepoPersonas(directory not found, empty, valid personas, invalid files skipped, auth errors propagated)MergePersonas(override behavior, original maps not modified)GetBuiltinPersonasMapisYAMLFileandisNotFoundErrorgo vet ./...passesscripts/check-deps.shpassesCloses #60
c1c8e61e9dto3f06ba2ea6Self-review: PASS
No issues found — ready for human review.
Review notes:
Coherence: ✅ Single clear purpose (load personas from target repo). Scope is right — adds types, loading logic, integration, and tests. No unrelated changes.
Fit: ✅
slog.Warn/slog.Debug/slog.Infopatternsfmt.Errorfwith%wwhere appropriate, log-and-skip for non-fatal errors)Solution: ✅
APIError.Error()producesCompleteness: ✅
.review-bot/personas/*.yamlfrom target repo via Gitea APIgo test ./...passesgo vet ./...passesscripts/check-deps.shpassesOriginal reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
3f06ba2e)Sonnet Review
Summary
Clean, well-structured implementation of repo persona loading. The code follows established patterns correctly, has thorough test coverage, and CI passes. Two minor issues worth noting: a duplicate
flag.Parse()call and a subtle nil-map access risk whenLoadRepoPersonasreturns an error.Findings
cmd/review-bot/main.goflag.Parse()call. The file callsflag.Parse()twice (lines ~100 and ~103 in the full file). The second call is a no-op but is dead code that should be removed.cmd/review-bot/main.goLoadRepoPersonasreturns an error,repoPersonaswill be nil (not an empty map), but the subsequentrepoPersonas[*personaName]map index on line ~187 is safe in Go (indexing a nil map returns the zero value, false). However, theerrvariable fromLoadRepoPersonasis shadowed by the laterpersona, err = review.LoadBuiltinPersona(...)assignment at line ~193, which is correct since the outererris already declared. The logic is sound but slightly subtle — consider using a blank identifier or a named variablerepoErrto make the intent clearer.review/repo_persona.goisNotFoundErroruses a string-based heuristic (strings.Contains(errStr, "not found")) that could produce false positives for errors like "user not found" from application logic. TheHTTP 404check is precise; the barenot foundcheck is looser. This is acknowledged in the comment but worth considering whether to tighten to only theHTTP 404pattern.review/repo_persona_test.goTestParsePersonaBytestable struct has inconsistent field alignment (extra blank line betweensourceandwantNamefields in some entries). This is cosmetic andgofmtshould handle it, but it's visible in the raw source.Recommendation
APPROVE — Approve with the recommendation to fix the duplicate
flag.Parse()call before or shortly after merge — it's dead code that could confuse future readers. The nil-map access on an erroredrepoPersonasreturn is safe in Go but the error shadowing pattern is slightly subtle. TheisNotFoundErrorheuristic is documented and acceptable given the import-cycle constraint preventing use of a typed error. Overall, this is a well-designed feature: proper interface abstraction to avoid import cycles, thorough test coverage following table-driven patterns, correct fallback behavior, and appropriate error propagation semantics.Review by sonnet
Evaluated against
3f06ba2e@@ -199,0 +182,4 @@if err != nil {slog.Warn("could not load repo personas", "repo", owner+"/"+repoName, "error", err)// Continue with built-in personas only}[MINOR] When
LoadRepoPersonasreturns an error,repoPersonaswill be nil (not an empty map), but the subsequentrepoPersonas[*personaName]map index on line ~187 is safe in Go (indexing a nil map returns the zero value, false). However, theerrvariable fromLoadRepoPersonasis shadowed by the laterpersona, err = review.LoadBuiltinPersona(...)assignment at line ~193, which is correct since the outererris already declared. The logic is sound but slightly subtle — consider using a blank identifier or a named variablerepoErrto make the intent clearer.@@ -0,0 +130,4 @@// isNotFoundError checks if an error represents a 404 response.// This is a string-based heuristic since we don't have access to gitea.APIError here.func isNotFoundError(err error) bool {if err == nil {[NIT]
isNotFoundErroruses a string-based heuristic (strings.Contains(errStr, "not found")) that could produce false positives for errors like "user not found" from application logic. TheHTTP 404check is precise; the barenot foundcheck is looser. This is acknowledged in the comment but worth considering whether to tighten to only theHTTP 404pattern.@@ -0,0 +15,4 @@wantName stringwantErr string}{{[NIT] Minor formatting inconsistency: the
TestParsePersonaBytestable struct has inconsistent field alignment (extra blank line betweensourceandwantNamefields in some entries). This is cosmetic andgofmtshould handle it, but it's visible in the raw source.Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
3f06ba2e)Gpt Review
Summary
The changes cleanly add support for loading repo-specific personas and correctly defer persona selection until after Gitea initialization. Implementation is idiomatic, well-tested, and CI passes.
Findings
review/repo_persona.goreview/repo_persona.goreview/persona.goRecommendation
APPROVE — Overall the PR is solid: it defers persona loading to enable repo-based personas, adds a clean adapter for the Gitea client, and includes comprehensive tests. CI has passed. Consider adding a content size check for repo persona files in LoadRepoPersonas (similar to MaxPersonaFileSize for local files) and slightly tightening the isNotFoundError heuristic. These are minor improvements and do not block approval.
Review by gpt
Evaluated against
3f06ba2e@@ -227,0 +228,4 @@// This is useful for parsing personas fetched from external sources (e.g., Gitea API)// without requiring filesystem access. Format is detected by source extension.func ParsePersonaBytes(data []byte, source string) (*Persona, error) {return parsePersona(data, source)[NIT] ParsePersonaBytes delegates to parsePersona, which is fine, but does not apply any size checks like LoadPersona. Consider documenting that size limits are only enforced for filesystem sources, or add a size check for external sources if appropriate.
@@ -0,0 +61,4 @@slog.Warn("could not fetch repo persona file","file", entry.Path,"repo", owner+"/"+repo,"error", err)[MINOR] LoadRepoPersonas parses repo personas via ParsePersonaBytes without enforcing MaxPersonaFileSize. A very large YAML file with few nodes (e.g., a huge identity string) could still be fetched and parsed, potentially increasing memory usage. Consider checking len(content) against MaxPersonaFileSize before parsing.
@@ -0,0 +118,4 @@}result[name] = p}return result[NIT] isNotFoundError uses broad string matching ("not found") which may incorrectly classify unrelated errors as 404s. If possible, narrow the check (e.g., look for an API error type or more specific message prefix) to reduce false positives.
Security Review
Summary
Good feature addition with strict YAML parsing and tests; however, repo-loaded personas are parsed without enforcing the existing MaxPersonaFileSize limit, creating a potential resource exhaustion vector. CI is green, but this security issue must be addressed.
Findings
review/repo_persona.goreview/repo_persona.goRecommendation
REQUEST_CHANGES — Add a strict size check for repo persona content before parsing to prevent resource exhaustion. Recommended options:
Also consider tightening isNotFoundError by checking a typed error/status code from the adapter rather than substring matching. You could extend the GiteaClient interface or the adapter to surface an explicit 404 indicator so non-404 errors are propagated correctly. Once the size limit is enforced, the PR can be approved.
Review by security
Evaluated against
3f06ba2e@@ -0,0 +60,4 @@if err != nil {slog.Warn("could not fetch repo persona file","file", entry.Path,"repo", owner+"/"+repo,[MAJOR] Repo persona content is parsed without enforcing a maximum size. An attacker can commit an excessively large YAML persona (e.g., single huge scalar) to the repo, causing high memory/CPU usage when the bot fetches and parses it. Unlike LoadPersona() (filesystem), LoadRepoPersonas()/ParsePersonaBytes() do not check MaxPersonaFileSize, enabling a DoS vector.
@@ -0,0 +101,4 @@slog.Debug("repo persona overrides built-in", "name", name)}result[name] = p}[MINOR] isNotFoundError uses a broad substring match ("HTTP 404" or "not found"). This can misclassify non-404 errors that contain "not found" (e.g., upstream message wording) and silently skip persona loading. While not directly exploitable, it can mask auth or transport errors.
Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
3f06ba2e)Sonnet Review
Summary
Clean, well-structured implementation of repo persona loading. The code follows established patterns (small interface, adapter pattern, graceful degradation), is well-tested, and CI passes. There are a few minor issues worth noting but none are blockers.
Findings
cmd/review-bot/main.goLoadRepoPersonasreturns an error (non-404), the code logs a warning and continues withrepoPersonasbeing nil (its zero value). The subsequentif p, ok := repoPersonas[*personaName]safely handles nil maps (map indexing on nil returns zero value), so this works correctly, but it's subtle. A reader might wonder ifrepoPersonascould cause a panic. Consider initializingrepoPersonasto an empty map before the error check, or adding a brief comment that nil map indexing is safe in Go.cmd/review-bot/main.goflag.Parse()is called twice (lines 78 and 79 in the full file). This is a pre-existing bug not introduced by this PR, but it's worth noting since the diff shows the surrounding context.review/repo_persona.goisNotFoundErroruses string-based heuristics ("not found"case-insensitive substring match). This could produce false positives — e.g., an error message like"operation not found in registry"from a completely different layer would be treated as a 404. The comment acknowledges this is a heuristic, which is acceptable given the import cycle constraint, but consider wrapping gitea errors with a typed sentinel or checking only for the more specific"HTTP 404"prefix to reduce false positives.review/repo_persona.goGiteaClientinterface name is somewhat generic for a package namedreview. Since this interface is exported and lives inreview, callers see it asreview.GiteaClient. A name likeRepoFileClientorContentClientwould be more descriptive of what it actually does (list/get file contents) rather than being named after a specific vendor.review/repo_persona_test.goTestParsePersonaBytestest struct definition betweenwantNameandwantErrfields. This is inconsistent with the other test structs in the file and would be flagged bygofmt.cmd/review-bot/main.gogiteaClientAdapterstruct and its constructor/methods could live in a separate file (e.g.,cmd/review-bot/adapter.go) to keepmain.gofocused, but this is a very minor organizational concern.Recommendation
APPROVE — Approve. The implementation is correct, well-designed, and thoroughly tested. The
GiteaClientinterface properly decouples thereviewpackage from thegiteapackage, the adapter pattern is idiomatic, and graceful degradation (warning on repo persona load failure, fallback to built-in) is the right behavior. TheMergePersonasfunction correctly avoids mutating its inputs. TheisNotFoundErrorheuristic is the pragmatic approach given the import cycle constraint — the comment makes the limitation clear. The doubleflag.Parse()is a pre-existing issue not introduced here. None of the findings are blockers.Review by sonnet
Evaluated against
3f06ba2e@@ -199,0 +180,4 @@// Try loading from repo first, then fall back to built-inrepoPersonas, err := review.LoadRepoPersonas(ctx, newGiteaClientAdapter(giteaClient), owner, repoName)if err != nil {slog.Warn("could not load repo personas", "repo", owner+"/"+repoName, "error", err)[MINOR] When
LoadRepoPersonasreturns an error (non-404), the code logs a warning and continues withrepoPersonasbeing nil (its zero value). The subsequentif p, ok := repoPersonas[*personaName]safely handles nil maps (map indexing on nil returns zero value), so this works correctly, but it's subtle. A reader might wonder ifrepoPersonascould cause a panic. Consider initializingrepoPersonasto an empty map before the error check, or adding a brief comment that nil map indexing is safe in Go.@@ -786,0 +822,4 @@return result, nil}func (a *giteaClientAdapter) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {[NIT] The
giteaClientAdapterstruct and its constructor/methods could live in a separate file (e.g.,cmd/review-bot/adapter.go) to keepmain.gofocused, but this is a very minor organizational concern.@@ -0,0 +10,4 @@const RepoPersonaPath = ".review-bot/personas"// GiteaClient defines the subset of gitea.Client methods needed for loading repo personas.// This interface allows for easier testing and decouples the review package from gitea.[NIT] The
GiteaClientinterface name is somewhat generic for a package namedreview. Since this interface is exported and lives inreview, callers see it asreview.GiteaClient. A name likeRepoFileClientorContentClientwould be more descriptive of what it actually does (list/get file contents) rather than being named after a specific vendor.@@ -0,0 +125,4 @@func isYAMLFile(name string) bool {lower := strings.ToLower(name)return strings.HasSuffix(lower, ".yaml") || strings.HasSuffix(lower, ".yml")}[MINOR]
isNotFoundErroruses string-based heuristics ("not found"case-insensitive substring match). This could produce false positives — e.g., an error message like"operation not found in registry"from a completely different layer would be treated as a 404. The comment acknowledges this is a heuristic, which is acceptable given the import cycle constraint, but consider wrapping gitea errors with a typed sentinel or checking only for the more specific"HTTP 404"prefix to reduce false positives.@@ -0,0 +10,4 @@func TestParsePersonaBytes(t *testing.T) {tests := []struct {name stringdata string[NIT] Minor style: extra blank line in the
TestParsePersonaBytestest struct definition betweenwantNameandwantErrfields. This is inconsistent with the other test structs in the file and would be flagged bygofmt.Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
3f06ba2e)Gpt Review
Summary
Solid implementation of repo-scoped personas with a clean adapter, sensible fallback to built-ins, and comprehensive tests. CI passed and the changes follow idiomatic Go patterns with appropriate error handling and limits for YAML parsing.
Findings
review/repo_persona.goreview/repo_persona.goRecommendation
APPROVE — The PR is well-structured and adheres to idiomatic Go. The new repo persona loading path is safe (size/depth limits are enforced for YAML), gracefully falls back to built-ins, and the adapter avoids import cycles. Tests cover key scenarios. Consider hardening 404 detection by using typed errors or normalizing via the adapter, and evaluate whether logging inside review/repo_persona.go should be delegated to callers. With these minor considerations, the changes are good to merge.
Review by gpt
Evaluated against
3f06ba2e@@ -0,0 +48,4 @@}for _, entry := range entries {if entry.Type != "file" {[NIT] LoadRepoPersonas logs warnings for per-file fetch/parse failures inside the library package. Consider returning these details (e.g., as a slice of skipped files) and letting callers decide how to log to reduce potential log noise from dependencies.
@@ -0,0 +106,4 @@return result}// GetBuiltinPersonasMap returns all built-in personas as a map keyed by name.[MINOR] isNotFoundError relies on substring matching of error messages ("HTTP 404" or "not found"), which is brittle. Prefer a typed error with errors.Is or have the adapter normalize 404s to a sentinel error where possible.
Security Review
Summary
The feature is implemented cleanly and CI has passed, but there is a denial-of-service risk when loading repo personas: content size is unbounded and all files in the personas directory are processed without limits. These pose resource exhaustion vectors.
Findings
review/repo_persona.goreview/repo_persona.goRecommendation
REQUEST_CHANGES — Introduce a size check for repo persona content to mitigate memory exhaustion. After fetching content via GetFileContent, reject and skip parsing if len(content) exceeds MaxPersonaFileSize (reuse the existing MaxPersonaFileSize constant). Alternatively, extend the ContentEntry to include file sizes (if available from the API) and pre-filter large files before fetching. Also, consider capping the number of YAML files processed from .review-bot/personas (e.g., process only the first N files, or require specific naming) to limit network and parsing work. These changes will align repo persona loading with the existing safeguards in LoadPersona and reduce denial-of-service risk while preserving functionality.
Review by security
Evaluated against
3f06ba2e@@ -0,0 +42,4 @@return nil, err}if len(entries) == 0 {[MINOR] Unbounded number of files processed: the loader iterates over all entries in .review-bot/personas and fetches each YAML file. A repository could include a very large number of small files to cause many network requests and parses, leading to CPU and memory pressure. Consider imposing a reasonable cap on the number of persona files processed.
@@ -0,0 +51,4 @@if entry.Type != "file" {continue}// Only process YAML files[MAJOR] Unbounded repo persona file size: content fetched from the repository is read into memory without any size limit before parsing. An attacker can commit an extremely large YAML file under .review-bot/personas to cause excessive memory usage or OOM. Unlike LoadPersona (which enforces MaxPersonaFileSize), repo personas have no such cap.
Review Findings Addressed
Pushed commit
9775cb0addressing all review findings:MAJOR (fixed):
review/repo_persona.goline 63 — AddedMaxPersonaFileSizecheck inLoadRepoPersonas()before callingParsePersonaBytes(). Oversized files are now logged and skipped, preventing resource exhaustion from malicious large YAML files.MINOR (fixed):
review/repo_persona.goisNotFoundError— Tightened substring match from"HTTP 404" || "not found"to just"HTTP 404". This prevents auth/transport errors containing generic "not found" text from being misidentified as 404s.cmd/review-bot/main.go— Removed duplicateflag.Parse()call.cmd/review-bot/main.goline 183 — Added comment explaining that nil map indexing is safe in Go (returns zero value), so the fallback to built-in personas works correctly whenLoadRepoPersonaserrors.Tests:
TestIsNotFoundErrorto reflect the intentional behavior change (generic "not found" → false)skips oversized filesto verify the size limit enforcementAll tests pass:
go test ./...✅No issues:
go vet ./...✅Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
1dd73bc4)Sonnet Review
Summary
This PR cleanly implements repo-level persona loading with good separation of concerns, comprehensive tests, and correct error handling. The code follows established patterns in the codebase and the CI passes.
Findings
cmd/review-bot/main.goflag.Parse()call: the diff shows the originalflag.Parse()was removed and a new one kept, but the full file content showsflag.Parse()appears only once. This appears to be a diff artifact (the-removed one and the context line shows the kept one). Verify the compiled binary doesn't callflag.Parse()twice — if it does, this should be fixed as double-parsing has undefined behavior for some flag types.review/repo_persona.goisNotFoundErroruses string-substring matching on error messages (strings.Contains(err.Error(), "HTTP 404")). This is fragile and couples the review package to the specific error format produced by the gitea package. A more robust approach would be to define a sentinel error type (e.g.,ErrNotFound) in the gitea package and useerrors.Ishere, or to have theGiteaClientinterface return a typed error. The current approach is documented and tested, so it's functional, but it's a design smell.cmd/review-bot/main.goLoadRepoPersonasreturns an error,repoPersonasis nil. The comment correctly notes that map indexing on a nil map is safe in Go (returns zero value). However,errfromLoadRepoPersonasis shadowed — the outervar persona *review.Personablock has noerrdeclaration, so theerrused inpersona, err = review.LoadBuiltinPersona(...)on ~line 196 is an undeclared variable in the else branch. Check that theerrvariable is properly scoped here — it may need a short variable declaration (:=) in the else branch.review/repo_persona_test.goTestParsePersonaByteshas extra blank lines in field alignment (name,data,source,wantName,wantErr). This doesn't affect correctness but is inconsistent with Go'sgofmtstyle.review/repo_persona.goMergePersonasis defined but not used inmain.go— the main flow loads repo personas and then falls back to built-in individually, rather than merging all of them first.MergePersonasis a useful utility and is tested, but consider whether it should be used in main (load all built-ins + all repo personas into a merged map, then look up) for consistency and simplicity. As-is, the main flow only fetches repo personas on every invocation when--personais specified, even though the built-in fallback path doesn't benefit from pre-loading all built-ins.Recommendation
APPROVE — The PR is well-structured, tests are comprehensive and cover edge cases (auth errors, oversized files, invalid YAML, skipped non-YAML files), and CI passes. The main concern to double-check before merging is the potential
errvariable scoping issue in the persona loading block inmain.go(theelsebranch useserrfromLoadBuiltinPersonabut there's novar err errordeclaration visible in that scope after theif err != nilcheck onLoadRepoPersonas). If the code compiles cleanly (which CI confirms it does), this is safe to approve. TheisNotFoundErrorstring-matching approach is a minor design concern worth a follow-up issue but not a blocker.Review by sonnet
Evaluated against
1dd73bc4@@ -81,4 +81,3 @@flag.Parse()flag.Parse()[MINOR] Duplicate
flag.Parse()call: the diff shows the originalflag.Parse()was removed and a new one kept, but the full file content showsflag.Parse()appears only once. This appears to be a diff artifact (the-removed one and the context line shows the kept one). Verify the compiled binary doesn't callflag.Parse()twice — if it does, this should be fixed as double-parsing has undefined behavior for some flag types.@@ -199,0 +180,4 @@repoPersonas, err := review.LoadRepoPersonas(ctx, newGiteaClientAdapter(giteaClient), owner, repoName)if err != nil {slog.Warn("could not load repo personas", "repo", owner+"/"+repoName, "error", err)// Continue with built-in personas only.[MINOR] When
LoadRepoPersonasreturns an error,repoPersonasis nil. The comment correctly notes that map indexing on a nil map is safe in Go (returns zero value). However,errfromLoadRepoPersonasis shadowed — the outervar persona *review.Personablock has noerrdeclaration, so theerrused inpersona, err = review.LoadBuiltinPersona(...)on ~line 196 is an undeclared variable in the else branch. Check that theerrvariable is properly scoped here — it may need a short variable declaration (:=) in the else branch.@@ -0,0 +77,4 @@}persona, err := ParsePersonaBytes([]byte(content), entry.Path)if err != nil {[NIT]
MergePersonasis defined but not used inmain.go— the main flow loads repo personas and then falls back to built-in individually, rather than merging all of them first.MergePersonasis a useful utility and is tested, but consider whether it should be used in main (load all built-ins + all repo personas into a merged map, then look up) for consistency and simplicity. As-is, the main flow only fetches repo personas on every invocation when--personais specified, even though the built-in fallback path doesn't benefit from pre-loading all built-ins.@@ -0,0 +127,4 @@slog.Warn("could not load built-in persona", "name", name, "error", err)continue}result[name] = p[MINOR]
isNotFoundErroruses string-substring matching on error messages (strings.Contains(err.Error(), "HTTP 404")). This is fragile and couples the review package to the specific error format produced by the gitea package. A more robust approach would be to define a sentinel error type (e.g.,ErrNotFound) in the gitea package and useerrors.Ishere, or to have theGiteaClientinterface return a typed error. The current approach is documented and tested, so it's functional, but it's a design smell.@@ -0,0 +13,4 @@data stringsource stringwantName stringwantErr string[NIT] Minor style nit: the struct literal in
TestParsePersonaByteshas extra blank lines in field alignment (name,data,source,wantName,wantErr). This doesn't affect correctness but is inconsistent with Go'sgofmtstyle.Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
1dd73bc4)Gpt Review
Summary
Solid addition of repo-scoped personas with a clean, small interface and thorough tests. The main flow correctly prioritizes repo personas with a safe fallback to built-ins, and the adapter pattern keeps review decoupled from gitea.
Findings
review/repo_persona.goreview/repo_persona.gocmd/review-bot/main.goRecommendation
APPROVE — The changes are well-structured and adhere to idiomatic Go patterns: a small interface for Gitea access, clear separation of responsibilities, and comprehensive tests. The main flow safely loads repo personas first with a built-in fallback. Consider making the 404 detection more robust and optionally supporting JSON files in the repo personas to avoid surprising users. Otherwise, this is ready to merge.
Review by gpt
Evaluated against
1dd73bc4@@ -196,6 +173,43 @@ func main() {ctx, cancel := context.WithTimeout(context.Background(), overallTimeout)[NIT] When LoadRepoPersonas returns an error, the code relies on nil-map indexing semantics for the fallback. While correct, initializing repoPersonas to an empty map on error may improve readability.
@@ -0,0 +107,4 @@}// Overlay repo personas (override on collision)for name, p := range repo {[MINOR] LoadRepoPersonas filters to YAML files only via isYAMLFile, but ParsePersonaBytes supports JSON based on extension. Consider accepting .json in repo personas for consistency and user convenience.
@@ -0,0 +113,4 @@}result[name] = p}[MINOR] isNotFoundError relies on substring matching "HTTP 404" in error messages, which is brittle and coupled to the client's error text. Consider a more robust approach (typed errors or explicit status code) if possible.
Sonnet Review
Summary
Clean, well-tested implementation that correctly loads personas from a target repo's
.review-bot/personas/directory with appropriate fallback to built-ins. CI passes, code follows established patterns, and edge cases (auth errors, oversized files, invalid YAML, 404) are handled correctly.Findings
cmd/review-bot/main.goflag.Parse()call — the diff shows the oldflag.Parse()was removed but the remaining one on line 81 is a leftover duplicate. The full file content showsflag.Parse()appears twice consecutively (lines 80-81). This is a copy-paste artifact from the diff. It is harmless since repeatedflag.Parse()calls are idempotent, but it's noise and should be cleaned up.review/repo_persona.goisNotFoundErrorrelies on string matching (strings.Contains(err.Error(), "HTTP 404")) against the gitea client's error format. This is a brittle coupling: if the gitea package ever changes its error message format, this silently breaks (directory-not-found becomes a propagated error). A sentinel error or a typed error in the gitea package would be more robust, but given this is an internal package with a defined error format and the function is well-documented with the rationale, this is acceptable in context.cmd/review-bot/main.goLoadRepoPersonasreturns an error,repoPersonasisniland the code falls through torepoPersonas[*personaName]. The comment correctly notes this is safe in Go (nil map read returns zero value), but the variableerris declared viarepoPersonas, err := ...and then reassigned in theelsebranch withpersona, err = review.LoadBuiltinPersona(...). This works due to Go's scoping rules but the comment about nil map safety would be clearer if placed directly above the map indexing line rather than after the Warn log.review/repo_persona_test.gonameandstringfield definitions). Minor formatting issue —gofmtwould normalize this.review/repo_persona.goMergePersonasis defined but never called inmain.go— the main flow only callsLoadRepoPersonasand does a direct map lookup.MergePersonasandGetBuiltinPersonasMapare exported utility functions that are tested but unused by the current main flow. This is fine as they're part of the public API for future use, but worth noting in a follow-up if they remain unused.Recommendation
APPROVE — Approve with minor cleanup. The implementation is correct, well-structured, and follows established patterns. The duplicate
flag.Parse()is the most notable artifact and should be removed in a follow-up commit, but it is harmless. The interface-based decoupling (GiteaClient/giteaClientAdapter) is idiomatic and enables the thorough testing visible inrepo_persona_test.go. Error handling is conservative (propagate auth errors, skip individual parse failures), which is the right call for a best-effort feature like loading optional repo personas.Review by sonnet
Evaluated against
1dd73bc4@@ -79,7 +79,6 @@ func main() {aicoreAPIURL := flag.String("aicore-api-url", envOrDefault("AICORE_API_URL", ""), "SAP AI Core API URL (for provider=aicore)")aicoreResourceGroup := flag.String("aicore-resource-group", envOrDefault("AICORE_RESOURCE_GROUP", "default"), "SAP AI Core resource group (for provider=aicore)")[MINOR] Duplicate
flag.Parse()call — the diff shows the oldflag.Parse()was removed but the remaining one on line 81 is a leftover duplicate. The full file content showsflag.Parse()appears twice consecutively (lines 80-81). This is a copy-paste artifact from the diff. It is harmless since repeatedflag.Parse()calls are idempotent, but it's noise and should be cleaned up.@@ -199,0 +182,4 @@slog.Warn("could not load repo personas", "repo", owner+"/"+repoName, "error", err)// Continue with built-in personas only.// NOTE: repoPersonas is nil here, but map indexing on a nil map is safe in Go// (returns the zero value), so the fallback to built-in below works correctly.[MINOR] When
LoadRepoPersonasreturns an error,repoPersonasisniland the code falls through torepoPersonas[*personaName]. The comment correctly notes this is safe in Go (nil map read returns zero value), but the variableerris declared viarepoPersonas, err := ...and then reassigned in theelsebranch withpersona, err = review.LoadBuiltinPersona(...). This works due to Go's scoping rules but the comment about nil map safety would be clearer if placed directly above the map indexing line rather than after the Warn log.@@ -0,0 +83,4 @@"repo", owner+"/"+repo,"error", err)continue}[NIT]
MergePersonasis defined but never called inmain.go— the main flow only callsLoadRepoPersonasand does a direct map lookup.MergePersonasandGetBuiltinPersonasMapare exported utility functions that are tested but unused by the current main flow. This is fine as they're part of the public API for future use, but worth noting in a follow-up if they remain unused.@@ -0,0 +144,4 @@// contain "not found" in their message.func isNotFoundError(err error) bool {if err == nil {return false[MINOR]
isNotFoundErrorrelies on string matching (strings.Contains(err.Error(), "HTTP 404")) against the gitea client's error format. This is a brittle coupling: if the gitea package ever changes its error message format, this silently breaks (directory-not-found becomes a propagated error). A sentinel error or a typed error in the gitea package would be more robust, but given this is an internal package with a defined error format and the function is well-documented with the rationale, this is acceptable in context.@@ -0,0 +16,4 @@wantErr string}{{name: "valid yaml",[NIT] Trailing whitespace on the struct field alignment lines (extra spaces between
nameandstringfield definitions). Minor formatting issue —gofmtwould normalize this.Security Review
Summary
Implements repository-provided personas with careful parsing and size/depth limits, plus safe fallback to built-ins. No exploitable vulnerabilities observed, and CI passed.
Findings
review/repo_persona.goreview/repo_persona.goRecommendation
APPROVE — Overall the changes are sound and security-conscious: repo personas are parsed with strict YAML settings, bounded depth/node counts, and size limits; untrusted inputs are not executed; and errors are handled conservatively. To harden further against resource exhaustion, add limits on the count of persona files processed and a total bytes processed budget, and consider leveraging structured error/status codes instead of substring matching in isNotFoundError. With CI passing and no major issues identified, this is approved.
Review by security
Evaluated against
1dd73bc4@@ -0,0 +67,4 @@}// Enforce size limit before parsing to prevent resource exhaustionif len(content) > MaxPersonaFileSize {[MINOR] Persona file size is validated only after fetching full content from the server. An attacker with write access to the default branch could commit very large YAML files and cause increased memory usage when GetFileContent returns them. Consider adding a cap on the number of files processed and/or a total bytes budget, and ensure server/client-side limits or streaming are used where possible.
@@ -0,0 +142,4 @@// This uses a specific "HTTP 404" substring match rather than a generic "not found"// match to avoid masking authentication failures or transport errors that might// contain "not found" in their message.func isNotFoundError(err error) bool {[MINOR] isNotFoundError relies on substring matching ("HTTP 404"). This is brittle and could misclassify errors if messages change or include similar text, potentially masking non-404 conditions. Prefer checking a structured status code from the client (if available) or a typed error to avoid misclassification.
Gpt Review
Summary
Well-scoped feature addition that loads repo-specific personas with a clean interface adapter and robust parsing. Tests cover the new functionality comprehensively, and CI is passing.
Findings
review/repo_persona.goRecommendation
APPROVE — The implementation cleanly introduces repo persona loading with a minimal GiteaClient interface, safe size limits, and strict YAML parsing. The main flow correctly prioritizes repo personas with a fallback to built-ins, and the adapter keeps the review package decoupled. Tests are comprehensive and follow table-driven patterns. Given CI passed and no major issues found, proceed with merge. Optionally, consider making isNotFoundError more robust if the underlying client exposes typed or structured errors.
Review by gpt
Evaluated against
1dd73bc4@@ -0,0 +138,4 @@return strings.HasSuffix(lower, ".yaml") || strings.HasSuffix(lower, ".yml")}// isNotFoundError checks if an error represents a 404 response.[MINOR] isNotFoundError relies on substring matching "HTTP 404", which may be brittle if the client error format changes. Consider checking a structured status code on the error (if available) or broadening the match without conflating auth/transport errors.
Security Review
Summary
Repo persona loading is implemented with sensible size/depth limits, strict YAML parsing, and contextual error handling. No exploitable vulnerabilities are evident; CI passed and the changes align with security best practices.
Findings
review/repo_persona.goRecommendation
APPROVE — Overall, the feature is implemented securely: YAML input is size-limited, depth- and node-count-limited, strict about unknown fields, and multi-document YAML is rejected. Persona selection falls back safely to built-ins and errors are logged without exposing secrets. As a defense-in-depth improvement, consider limiting the number of repo persona files processed or optimizing to fetch only the requested persona by filename when --persona is specified, to reduce potential resource exhaustion from a directory with many small files. No changes are required for correctness or security, so this can be merged as-is.
Review by security
Evaluated against
1dd73bc4@@ -0,0 +48,4 @@return result, nil}for _, entry := range entries {[MINOR] LoadRepoPersonas iterates and fetches every YAML file under .review-bot/personas without a per-repo cap. A malicious repo could place a very large number of small YAML files to increase API calls and parsing, causing resource usage spikes within the time budget. Consider short-circuiting when a specific persona is requested (e.g., try .yaml/.yml first) or enforcing a reasonable maximum number of files processed.