Revert "ci: disable setup-go cache (cache server unreachable)"

This reverts commit 8f564ea4f8.
ci: disable setup-go cache (cache server unreachable)
2026-05-10 19:44:08 -07:00 · 2026-05-10 19:43:46 -07:00 · 2026-05-10 19:29:13 -07:00 · 2026-05-10 19:05:37 -07:00 · 2026-05-11 01:39:43 +00:00 · 2026-05-10 17:53:42 -07:00
18 changed files with 1754 additions and 203 deletions
@@ -7,18 +7,22 @@

 ### Approved Third-Party Packages

-| Package | Use Case |
-|---------|----------|
-| `gopkg.in/yaml.v3` | YAML parsing (persona files, config) |
-| `github.com/google/go-cmp` | Test comparisons (`cmp.Diff`) |
+| Package | Use Case | Scope |
+|---------|----------|-------|
+| `gopkg.in/yaml.v3` | YAML parsing (persona files, config) | production |
+| `github.com/google/go-cmp` | Test comparisons (`cmp.Diff`) | test only |

 **Any import not in this table or the Go standard library is forbidden.**

+Transitive dependencies of approved packages are automatically allowed.
+
 To request a new dependency:
-1. Open a PR that ONLY updates this table with justification
+1. Open a PR that ONLY updates this table
 2. Requires explicit approval from Aaron
 3. After merge, a separate PR may use the package

+*Enforcement: `scripts/check-deps.sh` parses this table, so update dependencies only here.*
+
 ## Error Handling

 - Return errors; never panic.
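Note on the enforcement line in the hunk above: `scripts/check-deps.sh` is said to parse the approved-packages table, but the script is not part of this diff. The Go sketch below shows one way the first column of such a table could be extracted. The helper name `approvedPackages` and the sample input are assumptions made for illustration, not the script's actual logic.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// approvedPackages is a hypothetical helper: it scans markdown text and
// returns the backtick-quoted value found in the first column of each
// table row. Header and separator rows carry no backticks in column one,
// so they are skipped.
func approvedPackages(markdown string) []string {
	var pkgs []string
	sc := bufio.NewScanner(strings.NewReader(markdown))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, "|") {
			continue // not a table row
		}
		cells := strings.Split(strings.Trim(line, "|"), "|")
		first := strings.TrimSpace(cells[0])
		if len(first) > 2 && strings.HasPrefix(first, "`") && strings.HasSuffix(first, "`") {
			pkgs = append(pkgs, strings.Trim(first, "`"))
		}
	}
	return pkgs
}

func main() {
	table := "| Package | Use Case | Scope |\n" +
		"|---------|----------|-------|\n" +
		"| `gopkg.in/yaml.v3` | YAML parsing (persona files, config) | production |\n" +
		"| `github.com/google/go-cmp` | Test comparisons (`cmp.Diff`) | test only |\n"
	for _, p := range approvedPackages(table) {
		fmt.Println(p)
	}
}
```

Reading only the first column keeps the check independent of the new `Scope` column; a stricter check could also read that column to tell production imports from test-only ones.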
@@ -1,4 +1,4 @@
-.PHONY: build test test-integration lint clean coverage check-deps
+.PHONY: build test test-integration lint clean coverage check-deps precommit

 build:
 	go build -o review-bot ./cmd/review-bot/
@@ -9,7 +9,7 @@ AI-powered code review bot for Gitea pull requests. Fetches diff + context, send
 - **Smart budget**: Automatically trims context to fit model token limits
 - **Idempotent reviews**: Posts new review, then cleans up stale ones (one review per bot)
 - **Custom prompts**: Load additional instructions from a file (e.g. security-focused review)
- **Zero dependencies**: Go stdlib only
+- **Minimal dependencies**: Go stdlib + `gopkg.in/yaml.v3` only

 ## Quick Start: Composite Action

@@ -208,7 +208,7 @@ AI Core handles OAuth token management and deployment discovery automatically. M
 | `patterns-files` | No | `README.md` | Files/directories to fetch from pattern repos |
 | `system-prompt-file` | No | `""` | Local file with additional system prompt instructions |
 | `persona` | No | `""` | Built-in persona name (security, architect, docs) |
-| `persona-file` | No | `""` | Path to persona JSON file with custom review focus |
+| `persona-file` | No | `""` | Path to persona file (YAML or JSON) with custom review focus |
 | `temperature` | No | `0` | LLM temperature (0 = server default) |
 | `timeout` | No | `300` | LLM request timeout in seconds |
 | `dry-run` | No | `false` | Print review to stdout instead of posting |
@@ -408,32 +408,38 @@ Each persona posts independently with its own sentinel, so reviews don't interfe

 ### Custom Personas

-Create a JSON file with your domain-specific review focus:
+Create a YAML file with your domain-specific review focus:

-```json
-// .review/personas/trading.json
-{
-  "name": "trading",
-  "display_name": "Trading Domain Expert",
-  "identity": "You are a trading systems expert reviewing code for correctness.\n\nYour expertise:\n- Order lifecycle and state machines\n- Fill handling and partial fills\n- Position tracking and P&L calculations\n- Event sourcing invariants",
-  "focus": [
-    "Order state machine correctness",
-    "Fill handling edge cases (partial, overfill)",
-    "Position and P&L calculation accuracy",
-    "Event replay determinism",
-    "Decimal precision for money"
-  ],
-  "ignore": [
-    "Code style",
-    "General performance",
-    "Documentation formatting"
-  ],
-  "severity": {
-    "major": "Bugs that cause incorrect positions, fills, or money calculations",
-    "minor": "Edge cases that could cause issues under unusual conditions",
-    "nit": "Clarity improvements for domain logic"
-  }
-}
+```yaml
+# .review/personas/trading.yaml
+name: trading
+display_name: Trading Domain Expert
+
+identity: |
+  You are a trading systems expert reviewing code for correctness.
+
+  Your expertise:
+  - Order lifecycle and state machines
+  - Fill handling and partial fills
+  - Position tracking and P&L calculations
+  - Event sourcing invariants
+
+focus:
+  - Order state machine correctness
+  - Fill handling edge cases (partial, overfill)
+  - Position and P&L calculation accuracy
+  - Event replay determinism
+  - Decimal precision for money
+
+ignore:
+  - Code style
+  - General performance
+  - Documentation formatting
+
+severity:
+  major: "Bugs that cause incorrect positions, fills, or money calculations"
+  minor: "Edge cases that could cause issues under unusual conditions"
+  nit: "Clarity improvements for domain logic"
 ```

 Use it in CI:
@@ -442,17 +448,24 @@ Use it in CI:
 - uses: rodin/review-bot/.gitea/actions/review@v1
  with:
    reviewer-name: trading
-    persona-file: .review/personas/trading.json
+    persona-file: .review/personas/trading.yaml
    ...
 ```

+YAML is the recommended format for personas because it supports:
+- Multi-line strings with `|` blocks (cleaner identity definitions)
+- Comments for documentation
+- More readable arrays and nested structures
+
+JSON is still supported for backwards compatibility: use a `.json` extension.
+

 ### Persona vs system-prompt-file

 | Feature | `persona` / `persona-file` | `system-prompt-file` |
 |---------|---------------------------|----------------------|
 | Replaces base prompt | Yes | No (appends) |
-| Structured format | Yes (JSON) | No (freeform) |
+| Structured format | Yes (YAML/JSON) | No (freeform) |
 | Focus/ignore lists | Yes | Manual |
 | Severity calibration | Yes | Manual |
 | Header display name | Yes | No |
@@ -79,7 +79,6 @@ func main() {
 	aicoreAPIURL := flag.String("aicore-api-url", envOrDefault("AICORE_API_URL", ""), "SAP AI Core API URL (for provider=aicore)")
 	aicoreResourceGroup := flag.String("aicore-resource-group", envOrDefault("AICORE_RESOURCE_GROUP", "default"), "SAP AI Core resource group (for provider=aicore)")

-	flag.Parse()
 	flag.Parse()

 	if *versionFlag {
@@ -116,29 +115,7 @@ func main() {
 		os.Exit(1)
 	}

-	// Load persona if specified
-	var persona *review.Persona
-	if *personaName != "" {
-		var err error
-		persona, err = review.LoadBuiltinPersona(*personaName)
-		if err != nil {
-			slog.Error("failed to load persona", "persona", *personaName, "error", err)
-			os.Exit(1)
-		}
-		slog.Info("loaded built-in persona", "persona", persona.Name, "display", persona.DisplayName)
-	} else if *personaFile != "" {
-		resolvedPath, err := validateWorkspacePath(*personaFile, "persona-file")
-		if err != nil {
-			slog.Error("invalid persona-file path", "error", err)
-			os.Exit(1)
-		}
-		persona, err = review.LoadPersona(resolvedPath)
-		if err != nil {
-			slog.Error("failed to load persona file", "file", *personaFile, "error", err)
-			os.Exit(1)
-		}
-		slog.Info("loaded persona from file", "file", *personaFile, "persona", persona.Name)
-	}
+	// NOTE: Persona loading deferred until after Gitea client init to support repo personas

 	// Validate reviewer-name: only safe characters allowed in sentinel
 	if err := validateReviewerName(*reviewerName); err != nil {
@@ -196,6 +173,43 @@ func main() {
 	ctx, cancel := context.WithTimeout(context.Background(), overallTimeout)
 	defer cancel()

+	// Load persona if specified (after Gitea client init to support repo personas)
+	var persona *review.Persona
+	if *personaName != "" {
+		// Try loading from repo first, then fall back to built-in
+		repoPersonas, err := review.LoadRepoPersonas(ctx, newGiteaClientAdapter(giteaClient), owner, repoName)
+		if err != nil {
+			slog.Warn("could not load repo personas", "repo", owner+"/"+repoName, "error", err)
+			// Continue with built-in personas only.
+			// NOTE: repoPersonas is nil here, but map indexing on a nil map is safe in Go
+			// (returns the zero value), so the fallback to built-in below works correctly.
+		}
+		if p, ok := repoPersonas[*personaName]; ok {
+			persona = p
+			slog.Info("loaded repo persona", "persona", persona.Name, "display", persona.DisplayName, "repo", owner+"/"+repoName)
+		} else {
+			// Fall back to built-in
+			persona, err = review.LoadBuiltinPersona(*personaName)
+			if err != nil {
+				slog.Error("failed to load persona", "persona", *personaName, "error", err)
+				os.Exit(1)
+			}
+			slog.Info("loaded built-in persona", "persona", persona.Name, "display", persona.DisplayName)
+		}
+	} else if *personaFile != "" {
+		resolvedPath, err := validateWorkspacePath(*personaFile, "persona-file")
+		if err != nil {
+			slog.Error("invalid persona-file path", "error", err)
+			os.Exit(1)
+		}
+		persona, err = review.LoadPersona(resolvedPath)
+		if err != nil {
+			slog.Error("failed to load persona file", "file", *personaFile, "error", err)
+			os.Exit(1)
+		}
+		slog.Info("loaded persona from file", "file", *personaFile, "persona", persona.Name)
+	}
+
 	slog.Info("reviewing pull request", "pr", prNumber, "repo", fmt.Sprintf("%s/%s", owner, repoName))

 	// Step 1: Fetch PR metadata
@@ -783,3 +797,32 @@ func shouldSkipStaleReview(evaluatedSHA, currentSHA string) bool {
 	}
 	return evaluatedSHA != currentSHA
 }
+
+// giteaClientAdapter adapts gitea.Client to review.GiteaClient interface.
+type giteaClientAdapter struct {
+	client *gitea.Client
+}
+
+func newGiteaClientAdapter(c *gitea.Client) *giteaClientAdapter {
+	return &giteaClientAdapter{client: c}
+}
+
+func (a *giteaClientAdapter) ListContents(ctx context.Context, owner, repo, path string) ([]review.ContentEntry, error) {
+	entries, err := a.client.ListContents(ctx, owner, repo, path)
+	if err != nil {
+		return nil, err
+	}
+	result := make([]review.ContentEntry, len(entries))
+	for i, e := range entries {
+		result[i] = review.ContentEntry{
+			Name: e.Name,
+			Path: e.Path,
+			Type: e.Type,
+		}
+	}
+	return result, nil
+}
+
+func (a *giteaClientAdapter) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
+	return a.client.GetFileContent(ctx, owner, repo, filepath)
+}
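Note on the persona lookup order introduced in `main()` above: a repo-defined persona wins, the built-in set is the fallback, and a failed repo lookup leaves a nil map that is still safe to read. The sketch below isolates that order. The `persona` type, the `builtin` map, and `resolvePersona` are stand-ins invented for this example; the real code calls `review.LoadRepoPersonas` and `review.LoadBuiltinPersona`.

```go
package main

import (
	"errors"
	"fmt"
)

// persona is a stand-in for review.Persona; only the name matters here.
type persona struct{ Name string }

// builtin is a hypothetical substitute for the embedded persona set.
var builtin = map[string]*persona{"security": {Name: "security"}}

// resolvePersona prefers a repo-defined persona and falls back to the
// built-in set. repo may be nil (for example after a failed repo lookup):
// reading from a nil map is safe in Go and reports ok == false.
func resolvePersona(name string, repo map[string]*persona) (*persona, string, error) {
	if p, ok := repo[name]; ok {
		return p, "repo", nil
	}
	if p, ok := builtin[name]; ok {
		return p, "builtin", nil
	}
	return nil, "", errors.New("unknown persona " + name)
}

func main() {
	// nil repo map: the lookup falls through to the built-in set.
	_, source, err := resolvePersona("security", nil)
	fmt.Println(source, err)
}
```

Returning the source label alongside the persona makes the two log messages in the hunk ("loaded repo persona" and "loaded built-in persona") easy to drive from a single call site.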
@@ -0,0 +1,108 @@
+# Design: YAML Support for Persona Files (#57)
+
+## Problem
+
+JSON is awkward for persona files that contain multi-line text (identity, severity descriptions). YAML supports cleaner multi-line strings and comments, improving readability and maintainability.
+
+## Constraints
+
+- Backwards compatibility: existing JSON personas must continue to work
+- Security: protect against DoS via deeply nested YAML (AIKIDO-2024-10486)
+- Consistency: use `.yaml` extension (not `.yml`)
+- Library: use `gopkg.in/yaml.v3` (approved in CONVENTIONS.md) with explicit depth limiting
+
+## Proposed Approach
+
+1. **Update `parsePersona`** to detect format from file extension
+2. **Add YAML parsing** with explicit depth limit (defense in depth)
+3. **Keep JSON as fallback** for files without `.yaml`/`.yml` extension
+4. **Convert built-in personas** to YAML format
+5. **Update embed directive** to include `*.yaml`
+
+### File Extension Detection
+
+```go
+func parsePersona(data []byte, source string) (*Persona, error) {
+    lower := strings.ToLower(source) // case-insensitive match, per the completion checklist
+    if strings.HasSuffix(lower, ".yaml") || strings.HasSuffix(lower, ".yml") {
+        return parseYAML(data, source)
+    }
+    return parseJSON(data, source)
+}
+```
+
+### YAML Parsing with Depth Protection
+
+```go
+func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error {
+    var node yaml.Node
+    dec := yaml.NewDecoder(bytes.NewReader(data))
+    if err := dec.Decode(&node); err != nil {
+        return err
+    }
+    if err := checkYAMLDepth(&node, 0, maxDepth); err != nil {
+        return err
+    }
+    return node.Decode(out)
+}
+
+func checkYAMLDepth(node *yaml.Node, depth, maxDepth int) error {
+    if depth > maxDepth {
+        return fmt.Errorf("YAML nesting depth exceeds maximum (%d)", maxDepth)
+    }
+    // Handle alias nodes by following the Alias pointer
+    if node.Kind == yaml.AliasNode && node.Alias != nil {
+        return checkYAMLDepth(node.Alias, depth, maxDepth)
+    }
+    for _, child := range node.Content {
+        if err := checkYAMLDepth(child, depth+1, maxDepth); err != nil {
+            return err
+        }
+    }
+    return nil
+}
+```
+
+The `gopkg.in/yaml.v3` library does not have built-in depth protection, so we implement explicit depth checking by first decoding into a `yaml.Node`, walking the tree to verify depth (including alias resolution), then decoding into the target struct.
+
+## State/Data Model
+
+No new state. Same `Persona` struct, just different parsing.
+
+## Error Cases
+
+| Error | Handling |
+|-------|----------|
+| Invalid YAML syntax | Return parse error with source file |
+| Deeply nested YAML | Rejected by the explicit depth check (`checkYAMLDepth`) |
+| Unknown extension | Fall back to JSON parsing |
+| Missing required fields | Validation rejects after parse |
+
+## Edge Cases
+
+- File with `.json` extension but YAML content → JSON parse fails, user sees error
+- File with no extension → defaults to JSON
+- Embedded persona reference like `builtin:security` → detect by embed path (`personas/X.yaml`)
+
+## Testing Strategy
+
+1. Unit tests for YAML parsing (valid, invalid, deeply nested)
+2. Unit tests for extension detection
+3. Integration test for built-in personas (now YAML)
+4. Backwards compat test: verify JSON still works for external files
+
+## Completion Checklist
+
+1. [ ] `gopkg.in/yaml.v3` dependency added (`go.mod` pins v3.0.1)
+2. [ ] Extension detection uses case-insensitive comparison
+3. [ ] YAML parse errors include source file name
+4. [ ] JSON parsing still works for `.json` files
+5. [ ] Built-in personas converted to YAML with readable multi-line strings
+6. [ ] Embed directive updated to include `*.yaml`
+7. [ ] Test for deeply nested YAML rejection
+8. [ ] All existing tests pass
+
+## Open Questions
+
+- Should we support both `.yaml` AND `.yml`? Issue says `.yaml` only for consistency, but some users expect `.yml`. **Decision:** Support both for reading, recommend `.yaml` in docs.
+- Should we add a "format" field to detect mismatched extension/content? **Decision:** No, keep it simple. Extension determines format.
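Note on the extension rule settled in the design document above: `.yaml` and `.yml` select YAML in any letter case, and everything else, including a missing extension, falls back to JSON. A minimal standalone sketch of that rule follows. `detectFormat` is a hypothetical name; the real `parsePersona` computes a boolean inline.

```go
package main

import (
	"fmt"
	"strings"
)

// detectFormat is a hypothetical helper that mirrors the extension rule:
// a case-insensitive ".yaml" or ".yml" suffix selects YAML, and anything
// else (".json", unknown, or no extension) selects JSON.
func detectFormat(source string) string {
	lower := strings.ToLower(source)
	if strings.HasSuffix(lower, ".yaml") || strings.HasSuffix(lower, ".yml") {
		return "yaml"
	}
	return "json"
}

func main() {
	for _, s := range []string{"trading.yaml", "trading.YML", "trading.json", "trading", "builtin:security.yaml"} {
		fmt.Printf("%s -> %s\n", s, detectFormat(s))
	}
}
```

Because the rule looks only at the suffix, a source label such as `builtin:security.yaml` is classified the same way as a file path.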
@@ -1,3 +1,5 @@
 module gitea.weiker.me/rodin/review-bot

 go 1.26.2
+
+require gopkg.in/yaml.v3 v3.0.1
@@ -0,0 +1,4 @@
+gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
+gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
+gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
+gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
@@ -1,81 +1,153 @@
 package review

 import (
+	"bytes"
 	"embed"
 	"encoding/json"
 	"fmt"
 	"os"
+	"sort"
 	"strings"
 	"unicode/utf8"
+
+	"gopkg.in/yaml.v3"
 )

-//go:embed personas/*.json
+//go:embed personas/*.yaml
 var embeddedPersonas embed.FS

+// MaxPersonaFileSize is the maximum size for persona files (64 KB).
+// This prevents denial-of-service via excessively large files.
+const MaxPersonaFileSize = 64 * 1024
+
+// MaxYAMLDepth is the maximum nesting depth allowed in YAML persona files.
+// This prevents stack exhaustion from deeply nested structures.
+const MaxYAMLDepth = 20
+
+// MaxYAMLNodes is the maximum number of YAML nodes allowed in persona files.
+// This prevents DoS via wide-but-shallow structures that bypass depth limits.
+const MaxYAMLNodes = 1000
+
 // Persona defines a specialized review role with focused expertise.
 type Persona struct {
-	Name         string   `json:"name"`
-	DisplayName  string   `json:"display_name"`
-	ModelPref    string   `json:"model_preference,omitempty"`
-	Identity     string   `json:"identity"`
-	Focus        []string `json:"focus"`
-	Ignore       []string `json:"ignore"`
-	Severity     Severity `json:"severity"`
-	OutputFormat string   `json:"output_format,omitempty"`
+	Name         string   `json:"name" yaml:"name"`
+	DisplayName  string   `json:"display_name" yaml:"display_name"`
+	ModelPref    string   `json:"model_preference,omitempty" yaml:"model_preference,omitempty"`
+	Identity     string   `json:"identity" yaml:"identity"`
+	Focus        []string `json:"focus" yaml:"focus"`
+	Ignore       []string `json:"ignore" yaml:"ignore"`
+	Severity     Severity `json:"severity" yaml:"severity"`
+	OutputFormat string   `json:"output_format,omitempty" yaml:"output_format,omitempty"`
 }

 // Severity defines what constitutes each severity level for this persona.
 // These are prompt guidance for the LLM, not output format changes.
 type Severity struct {
-	Major string `json:"major"`
-	Minor string `json:"minor"`
-	Nit   string `json:"nit"`
+	Major string `json:"major" yaml:"major"`
+	Minor string `json:"minor" yaml:"minor"`
+	Nit   string `json:"nit" yaml:"nit"`
 }

-// LoadPersona loads a persona from a JSON file path.
+// LoadPersona loads a persona from a JSON or YAML file path.
+// Format is detected by file extension: .yaml/.yml for YAML, .json or other for JSON.
+// Files larger than MaxPersonaFileSize are rejected.
+//
+// Symlinks are supported: os.Stat follows symlinks, so a symlink pointing to
+// a regular file will pass the IsRegular() check. Symlinks to non-regular files
+// (directories, FIFOs, devices) are still rejected.
 func LoadPersona(path string) (*Persona, error) {
+	// os.Stat follows symlinks, so symlinks to regular files are supported.
+	// The IsRegular() check operates on the target, not the symlink itself.
+	info, err := os.Stat(path)
+	if err != nil {
+		return nil, fmt.Errorf("read persona file %s: %w", path, err)
+	}
+	if !info.Mode().IsRegular() {
+		return nil, fmt.Errorf("persona file %s is not a regular file", path)
+	}
+	if info.Size() > MaxPersonaFileSize {
+		return nil, fmt.Errorf("persona file %s exceeds maximum size (%d bytes)", path, MaxPersonaFileSize)
+	}
 	data, err := os.ReadFile(path)
 	if err != nil {
 		return nil, fmt.Errorf("read persona file %s: %w", path, err)
 	}
+	// Re-check size after read to defend against TOCTOU races where file
+	// grows between stat and read (e.g., appending process, replaced file).
+	if len(data) > MaxPersonaFileSize {
+		return nil, fmt.Errorf("persona file %s exceeds maximum size (%d bytes)", path, MaxPersonaFileSize)
+	}
 	return parsePersona(data, path)
 }

 // LoadBuiltinPersona loads a built-in persona by name.
 // Returns an error if the persona doesn't exist.
+// Built-in personas are stored in YAML format only (see embed directive).
 func LoadBuiltinPersona(name string) (*Persona, error) {
-	filename := name + ".json"
-	data, err := embeddedPersonas.ReadFile("personas/" + filename) // embed.FS paths use forward slashes per io/fs spec
+	yamlFile := name + ".yaml"
+	data, err := embeddedPersonas.ReadFile("personas/" + yamlFile)
 	if err != nil {
 		available := ListBuiltinPersonas()
 		return nil, fmt.Errorf("unknown built-in persona %q (available: %s)", name, strings.Join(available, ", "))
 	}
-	return parsePersona(data, "builtin:"+name)
+	return parsePersona(data, "builtin:"+yamlFile)
 }

-// ListBuiltinPersonas returns the names of all built-in personas.
+// ListBuiltinPersonas returns the names of all built-in personas in sorted order.
 // Returns an empty slice if the embedded directory cannot be read.
 func ListBuiltinPersonas() []string {
 	entries, err := embeddedPersonas.ReadDir("personas")
 	if err != nil {
 		return []string{}
 	}
-	var names []string
+	seen := make(map[string]bool)
 	for _, e := range entries {
 		if e.IsDir() {
 			continue
 		}
 		name := e.Name()
-		if strings.HasSuffix(name, ".json") {
-			names = append(names, strings.TrimSuffix(name, ".json"))
+		// Strip extension to get persona name
+		var personaName string
+		switch {
+		case strings.HasSuffix(name, ".yaml"):
+			personaName = strings.TrimSuffix(name, ".yaml")
+		case strings.HasSuffix(name, ".yml"):
+			personaName = strings.TrimSuffix(name, ".yml")
+		case strings.HasSuffix(name, ".json"):
+			personaName = strings.TrimSuffix(name, ".json")
+		default:
+			continue
+		}
+		if !seen[personaName] {
+			seen[personaName] = true
 		}
 	}
+	names := make([]string, 0, len(seen))
+	for name := range seen {
+		names = append(names, name)
+	}
+	sort.Strings(names)
 	return names
 }

+// parsePersona parses persona data from JSON or YAML format.
+// Format is detected by the source file extension.
 func parsePersona(data []byte, source string) (*Persona, error) {
+	lowerSource := strings.ToLower(source)
+	isYAML := strings.HasSuffix(lowerSource, ".yaml") || strings.HasSuffix(lowerSource, ".yml")
+
 	var p Persona
-	if err := json.Unmarshal(data, &p); err != nil {
+	var err error
+	if isYAML {
+		err = unmarshalYAMLWithDepthLimit(data, &p, MaxYAMLDepth)
+	} else {
+		// Use json.Decoder with DisallowUnknownFields for consistency with
+		// YAML's KnownFields(true) - both reject unknown fields to catch typos.
+		dec := json.NewDecoder(bytes.NewReader(data))
+		dec.DisallowUnknownFields()
+		err = dec.Decode(&p)
+	}
+	if err != nil {
 		return nil, fmt.Errorf("parse persona %s: %w", source, err)
 	}
 	if err := validatePersona(&p, source); err != nil {
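Note on the JSON branch of `parsePersona` in the hunk above: it switches from `json.Unmarshal` to a `json.Decoder` with `DisallowUnknownFields`, so a misspelled key now fails instead of being silently dropped. The sketch below shows that behaviour on a reduced struct. `miniPersona` and `decodeStrict` are illustrative names and are not part of the package.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// miniPersona is a trimmed-down stand-in for the Persona struct.
type miniPersona struct {
	Name     string   `json:"name"`
	Identity string   `json:"identity"`
	Focus    []string `json:"focus"`
}

// decodeStrict decodes JSON and rejects any object key that has no
// matching field in miniPersona.
func decodeStrict(data []byte) (*miniPersona, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	var p miniPersona
	if err := dec.Decode(&p); err != nil {
		return nil, err
	}
	return &p, nil
}

func main() {
	// "focuss" is a deliberate typo; strict decoding reports it as an error.
	_, err := decodeStrict([]byte(`{"name":"t","identity":"i","focuss":["x"]}`))
	fmt.Println("typo rejected:", err != nil)
}
```

One consequence worth checking before merge: existing JSON personas that carry extra keys were accepted before this change and will be rejected after it.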
@@ -84,6 +156,81 @@ func parsePersona(data []byte, source string) (*Persona, error) {
 	return &p, nil
 }

+// unmarshalYAMLWithDepthLimit unmarshals YAML data with explicit depth limiting
+// and strict field checking. This protects against stack exhaustion from deeply
+// nested structures and catches typos in field names.
+// Multi-document YAML files are rejected to prevent silent data loss.
+func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error {
+	// First pass: decode into a yaml.Node to check depth limits and node counts.
+	// This prevents stack exhaustion before we attempt to decode into structs.
+	var node yaml.Node
+	dec := yaml.NewDecoder(bytes.NewReader(data))
+	if err := dec.Decode(&node); err != nil {
+		return err
+	}
+
+	// Reject multi-document YAML files - silently ignoring additional documents
+	// could lead to confusing behavior where users think their changes take effect.
+	var extra yaml.Node
+	if dec.Decode(&extra) == nil {
+		return fmt.Errorf("multi-document YAML is not supported; only single-document files are allowed")
+	}
+
+	nodeCount := 0
+	if err := checkYAMLDepth(&node, 0, maxDepth, MaxYAMLNodes, make(map[*yaml.Node]struct{}), &nodeCount); err != nil {
+		return err
+	}
+
+	// Second pass: decode with strict field checking enabled.
+	// KnownFields(true) rejects unknown keys, catching typos like "focuss" or "identiy".
+	// We must re-decode from the original data because yaml.Node.Decode() doesn't
+	// support the KnownFields option.
+	strictDec := yaml.NewDecoder(bytes.NewReader(data))
+	strictDec.KnownFields(true)
+	return strictDec.Decode(out)
+}
+
+// checkYAMLDepth recursively checks that YAML nodes don't exceed the depth limit
+// or the total node count limit. It also detects alias cycles to prevent infinite
+// recursion from crafted YAML with self-referential aliases.
+func checkYAMLDepth(node *yaml.Node, depth, maxDepth, maxNodes int, seen map[*yaml.Node]struct{}, nodeCount *int) error {
+	if depth > maxDepth {
+		return fmt.Errorf("YAML nesting depth exceeds maximum (%d)", maxDepth)
+	}
+
+	// Track total nodes visited as defense-in-depth against wide-but-shallow attacks.
+	*nodeCount++
+	if *nodeCount > maxNodes {
+		return fmt.Errorf("YAML node count exceeds maximum (%d)", maxNodes)
+	}
+
+	// Revisit guard: a node seen before is either part of an alias cycle or a shared anchor target.
+	if _, ok := seen[node]; ok {
+		return nil // Already validated this subtree, skip to avoid infinite recursion.
+	}
+	seen[node] = struct{}{}
+
+	// Handle alias nodes: follow the alias to its anchor target.
+	// Increment depth when following aliases since they expand the effective structure.
+	if node.Kind == yaml.AliasNode && node.Alias != nil {
+		return checkYAMLDepth(node.Alias, depth+1, maxDepth, maxNodes, seen, nodeCount)
+	}
+
+	for _, child := range node.Content {
+		if err := checkYAMLDepth(child, depth+1, maxDepth, maxNodes, seen, nodeCount); err != nil {
+			return err
+		}
+	}
+	return nil
+}
+
+// ParsePersonaBytes parses persona data from bytes with a source label for errors.
+// This is useful for parsing personas fetched from external sources (e.g., Gitea API)
+// without requiring filesystem access. Format is detected by source extension.
+func ParsePersonaBytes(data []byte, source string) (*Persona, error) {
+	return parsePersona(data, source)
+}
+
 func validatePersona(p *Persona, source string) error {
 	if p.Name == "" {
 		return fmt.Errorf("persona %s: name is required", source)
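Note on `checkYAMLDepth` in the hunk above: it enforces two budgets in a single walk, a per-branch depth limit and a total node count shared across the whole document. The sketch below reproduces only that core on a plain tree so it runs with the standard library alone. The `node` type is a stand-in for `yaml.Node`, and alias handling and the `seen` set are left out on purpose.

```go
package main

import "fmt"

// node is a stand-in for yaml.Node: only the child list is modelled.
type node struct {
	children []*node
}

// checkTree walks the tree and fails as soon as either the nesting depth
// or the running node count exceeds its limit. count is shared across the
// whole walk, so wide-but-shallow trees are caught as well.
func checkTree(n *node, depth, maxDepth, maxNodes int, count *int) error {
	if depth > maxDepth {
		return fmt.Errorf("nesting depth exceeds maximum (%d)", maxDepth)
	}
	*count++
	if *count > maxNodes {
		return fmt.Errorf("node count exceeds maximum (%d)", maxNodes)
	}
	for _, c := range n.children {
		if err := checkTree(c, depth+1, maxDepth, maxNodes, count); err != nil {
			return err
		}
	}
	return nil
}

// chain builds a single-branch tree containing length nodes.
func chain(length int) *node {
	root := &node{}
	cur := root
	for i := 1; i < length; i++ {
		next := &node{}
		cur.children = append(cur.children, next)
		cur = next
	}
	return root
}

func main() {
	count := 0
	fmt.Println("shallow:", checkTree(chain(5), 0, 20, 1000, &count))
	count = 0
	fmt.Println("deep:", checkTree(chain(30), 0, 20, 1000, &count))
}
```

Passing the counter by pointer keeps the function free of package-level state, which matches how the hunk threads `nodeCount` through the recursion.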
@@ -1,10 +1,13 @@
 package review

 import (
+	"fmt"
 	"os"
 	"path/filepath"
 	"strings"
 	"testing"
+
+	"gopkg.in/yaml.v3"
 )

 func TestLoadBuiltinPersona(t *testing.T) {
@@ -87,6 +90,83 @@ func TestListBuiltinPersonas(t *testing.T) {
 	}
 }

+func TestLoadPersonaFromYAMLFile(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "test.yaml")
+
+	content := `# Test persona
+name: test
+display_name: Test Persona
+identity: |
+  You are a test persona.
+  Multi-line identity works.
+focus:
+  - testing
+  - validation
+ignore:
+  - nothing
+severity:
+  major: Big problems
+  minor: Small problems
+  nit: Tiny problems
+`
+
+	if err := os.WriteFile(path, []byte(content), 0644); err != nil {
+		t.Fatalf("failed to write test file: %v", err)
+	}
+
+	p, err := LoadPersona(path)
+	if err != nil {
+		t.Fatalf("LoadPersona failed: %v", err)
+	}
+
+	if p.Name != "test" {
+		t.Errorf("Name = %q, want %q", p.Name, "test")
+	}
+	if p.DisplayName != "Test Persona" {
+		t.Errorf("DisplayName = %q, want %q", p.DisplayName, "Test Persona")
+	}
+	if len(p.Focus) != 2 {
+		t.Errorf("Focus len = %d, want 2", len(p.Focus))
+	}
+	if !strings.Contains(p.Identity, "Multi-line") {
+		t.Error("Identity should contain multi-line content")
+	}
+}
+
+func TestLoadPersonaFromYMLFile(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "test.yml")
+
+	content := `name: test
+display_name: Test YML
+identity: Test identity
+focus:
+  - testing
+ignore: []
+severity:
+  major: Big
+  minor: Small
+  nit: Tiny
+`
+
+	if err := os.WriteFile(path, []byte(content), 0644); err != nil {
+		t.Fatalf("failed to write test file: %v", err)
+	}
+
+	p, err := LoadPersona(path)
+	if err != nil {
+		t.Fatalf("LoadPersona failed: %v", err)
+	}
+
+	if p.Name != "test" {
+		t.Errorf("Name = %q, want %q", p.Name, "test")
+	}
+	if p.DisplayName != "Test YML" {
+		t.Errorf("DisplayName = %q, want %q", p.DisplayName, "Test YML")
+	}
+}
+
 func TestLoadPersonaFromJSONFile(t *testing.T) {
 	dir := t.TempDir()
 	path := filepath.Join(dir, "test.json")
@@ -96,6 +176,7 @@ func TestLoadPersonaFromJSONFile(t *testing.T) {
 		"display_name": "Test Persona",
 		"identity": "You are a test persona.\nMulti-line identity works.",
 		"focus": ["testing", "validation"],
+
 		"ignore": ["nothing"],
 		"severity": {
 			"major": "Big problems",
@@ -130,22 +211,38 @@ func TestLoadPersonaFromJSONFile(t *testing.T) {
 func TestLoadPersonaValidation(t *testing.T) {
 	tests := []struct {
 		name    string
-		json    string
+		content string
+		ext     string
 		wantErr string
 	}{
 		{
-			name:    "missing name",
-			json:    `{"identity": "test"}`,
+			name:    "missing name yaml",
+			content: "identity: test\n",
+			ext:     ".yaml",
 			wantErr: "name is required",
 		},
 		{
-			name:    "missing identity",
-			json:    `{"name": "test"}`,
+			name:    "missing identity yaml",
+			content: "name: test\n",
+			ext:     ".yaml",
 			wantErr: "identity is required",
 		},
 		{
-			name: "display_name defaults to name",
-			json: `{"name": "test", "identity": "test identity"}`,
+			name:    "missing name json",
+			content: `{"identity": "test"}`,
+			ext:     ".json",
+			wantErr: "name is required",
+		},
+		{
+			name:    "missing identity json",
+			content: `{"name": "test"}`,
+			ext:     ".json",
+			wantErr: "identity is required",
+		},
+		{
+			name:    "display_name defaults to name",
+			content: "name: test\nidentity: test identity\n",
+			ext:     ".yaml",
 			// No error expected - should succeed
 		},
 	}
@@ -153,8 +250,8 @@ func TestLoadPersonaValidation(t *testing.T) {
 	for _, tt := range tests {
 		t.Run(tt.name, func(t *testing.T) {
 			dir := t.TempDir()
-			path := filepath.Join(dir, "test.json")
-			if err := os.WriteFile(path, []byte(tt.json), 0644); err != nil {
+			path := filepath.Join(dir, "test"+tt.ext)
+			if err := os.WriteFile(path, []byte(tt.content), 0644); err != nil {
 				t.Fatalf("failed to write test file: %v", err)
 			}

@@ -184,12 +281,25 @@ func TestLoadPersonaValidation(t *testing.T) {
 }

 func TestLoadPersonaFileNotFound(t *testing.T) {
-	_, err := LoadPersona("/nonexistent/path/persona.json")
+	_, err := LoadPersona("/nonexistent/path/persona.yaml")
 	if err == nil {
 		t.Error("expected error for nonexistent file")
 	}
 }

+func TestLoadPersonaInvalidYAML(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "invalid.yaml")
+	if err := os.WriteFile(path, []byte("not valid yaml:\n  - [broken"), 0644); err != nil {
+		t.Fatalf("failed to write test file: %v", err)
+	}
+
+	_, err := LoadPersona(path)
+	if err == nil {
+		t.Error("expected error for invalid YAML")
+	}
+}
+
 func TestLoadPersonaInvalidJSON(t *testing.T) {
 	dir := t.TempDir()
 	path := filepath.Join(dir, "invalid.json")
@@ -203,6 +313,38 @@ func TestLoadPersonaInvalidJSON(t *testing.T) {
 	}
 }

+func TestLoadPersonaCaseInsensitiveExtension(t *testing.T) {
+	tests := []struct {
+		name string
+		ext  string
+	}{
+		{"lowercase yaml", ".yaml"},
+		{"uppercase YAML", ".YAML"},
+		{"mixed case Yaml", ".Yaml"},
+		{"lowercase yml", ".yml"},
+		{"uppercase YML", ".YML"},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			dir := t.TempDir()
+			path := filepath.Join(dir, "test"+tt.ext)
+			content := "name: test\nidentity: test identity\n"
+			if err := os.WriteFile(path, []byte(content), 0644); err != nil {
+				t.Fatalf("failed to write test file: %v", err)
+			}
+
+			p, err := LoadPersona(path)
+			if err != nil {
+				t.Fatalf("LoadPersona failed for extension %s: %v", tt.ext, err)
+			}
+			if p.Name != "test" {
+				t.Errorf("Name = %q, want %q", p.Name, "test")
+			}
+		})
+	}
+}
+
 func TestCapitalizeFirst(t *testing.T) {
 	tests := []struct {
 		input string
@@ -237,3 +379,400 @@ func TestListBuiltinPersonasReturnsEmptySlice(t *testing.T) {
 		t.Error("ListBuiltinPersonas should return empty slice, not nil")
 	}
 }
+
+func TestYAMLMultilineStrings(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "multiline.yaml")
+
+	// Test literal block scalar (|) which preserves newlines
+	content := `name: multiline
+display_name: Multiline Test
+identity: |
+  First line.
+  Second line.
+  Third line.
+focus:
+  - item one
+ignore: []
+severity:
+  major: Major issue
+  minor: Minor issue
+  nit: Nit
+`
+
+	if err := os.WriteFile(path, []byte(content), 0644); err != nil {
+		t.Fatalf("failed to write test file: %v", err)
+	}
+
+	p, err := LoadPersona(path)
+	if err != nil {
+		t.Fatalf("LoadPersona failed: %v", err)
+	}
+
+	// Literal block scalar preserves newlines
+	if !strings.Contains(p.Identity, "\n") {
+		t.Error("Identity should contain newlines from literal block scalar")
+	}
+	if !strings.Contains(p.Identity, "Second line") {
+		t.Error("Identity should contain 'Second line'")
+	}
+}
+
+func TestYAMLComments(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "comments.yaml")
+
+	content := `# This is a comment
+name: commented  # inline comment
+display_name: Commented Persona
+# Another comment
+identity: Test identity
+focus:
+  - item  # comment after item
+ignore: []
+severity:
+  major: Major
+  minor: Minor
+  nit: Nit
+`
+
+	if err := os.WriteFile(path, []byte(content), 0644); err != nil {
+		t.Fatalf("failed to write test file: %v", err)
+	}
+
+	p, err := LoadPersona(path)
+	if err != nil {
+		t.Fatalf("LoadPersona failed: %v", err)
+	}
+
+	// Comments should be ignored
+	if p.Name != "commented" {
+		t.Errorf("Name = %q, want %q", p.Name, "commented")
+	}
+	if p.Focus[0] != "item" {
+		t.Errorf("Focus[0] = %q, want %q", p.Focus[0], "item")
+	}
+}
+
+func TestYAMLDeeplyNestedRejection(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "deeply-nested.yaml")
+
+	// Build a deeply nested YAML structure that exceeds MaxYAMLDepth (20).
+	// Each level adds 2 to the depth count (key + value mapping).
+	var sb strings.Builder
+	sb.WriteString("name: test\nidentity: test\nnested:\n")
+	indent := "  "
+	for i := 0; i < 25; i++ {
+		sb.WriteString(strings.Repeat(indent, i+1))
+		sb.WriteString(fmt.Sprintf("level%d:\n", i))
+	}
+	sb.WriteString(strings.Repeat(indent, 26))
+	sb.WriteString("value: too-deep\n")
+
+	if err := os.WriteFile(path, []byte(sb.String()), 0644); err != nil {
+		t.Fatalf("failed to write test file: %v", err)
+	}
+
+	_, err := LoadPersona(path)
+	if err == nil {
+		t.Fatal("expected error for deeply nested YAML, got nil")
+	}
+	if !strings.Contains(err.Error(), "nesting depth exceeds") {
+		t.Errorf("error = %q, want containing 'nesting depth exceeds'", err.Error())
+	}
+}
+
+func TestYAMLFileSizeLimit(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "huge.yaml")
+
+	// Create a file larger than MaxPersonaFileSize (64 KB)
+	content := "name: test\nidentity: " + strings.Repeat("x", MaxPersonaFileSize+1) + "\n"
+	if err := os.WriteFile(path, []byte(content), 0644); err != nil {
+		t.Fatalf("failed to write test file: %v", err)
+	}
+
+	_, err := LoadPersona(path)
+	if err == nil {
+		t.Fatal("expected error for oversized file, got nil")
+	}
+	if !strings.Contains(err.Error(), "exceeds maximum size") {
+		t.Errorf("error = %q, want containing 'exceeds maximum size'", err.Error())
+	}
+}
+
+func TestYAMLAliasCycleDetection(t *testing.T) {
+	// Test that our checkYAMLDepth function handles alias cycles gracefully
+	// by using the seen map to prevent infinite recursion.
+	// We test this directly because go-yaml's parser handles most cycles
+	// at parse time, but we need to ensure our checker is robust.
+
+	// Create a node structure where an alias points to a parent node,
+	// simulating what could happen with malicious input that bypasses
+	// go-yaml's cycle detection.
+	parent := &yaml.Node{
+		Kind: yaml.MappingNode,
+		Content: []*yaml.Node{
+			{Kind: yaml.ScalarNode, Value: "name"},
+			{Kind: yaml.ScalarNode, Value: "test"},
+			{Kind: yaml.ScalarNode, Value: "nested"},
+		},
+	}
+
+	// Create a child that aliases back to the parent (artificial cycle)
+	aliasToParent := &yaml.Node{
+		Kind:  yaml.AliasNode,
+		Alias: parent,
+	}
+	parent.Content = append(parent.Content, aliasToParent)
+
+	nodeCount := 0
+	seen := make(map[*yaml.Node]struct{})
+
+	// This should NOT hang or stack overflow - the seen map prevents infinite recursion
+	err := checkYAMLDepth(parent, 0, MaxYAMLDepth, MaxYAMLNodes, seen, &nodeCount)
+	if err != nil {
+		t.Errorf("unexpected error traversing cyclic structure: %v", err)
+	}
+
+	// Verify we tracked the parent in the seen map
+	if _, ok := seen[parent]; !ok {
+		t.Error("parent node not tracked in seen map")
+	}
+}
+
+func TestYAMLMultiDocumentRejection(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "multi.yaml")
+
+	// Multi-document YAML (documents separated by ---)
+	content := `name: first
+identity: first document
+---
+name: second
+identity: second document
+`
+	if err := os.WriteFile(path, []byte(content), 0644); err != nil {
+		t.Fatalf("failed to write test file: %v", err)
+	}
+
+	_, err := LoadPersona(path)
+	if err == nil {
+		t.Fatal("expected error for multi-document YAML, got nil")
+	}
+	if !strings.Contains(err.Error(), "multi-document") {
+		t.Errorf("error = %q, want containing 'multi-document'", err.Error())
+	}
+}
+
+func TestYAMLNodeCountLimit(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "wide.yaml")
+
+	// Build a YAML structure that's shallow but wide: 600 key/value pairs at the
+	// same level yield more than 1,200 scalar nodes, exceeding MaxYAMLNodes (1000).
+	var sb strings.Builder
+	sb.WriteString("name: test\nidentity: test\n")
+	for i := 0; i < 600; i++ {
+		sb.WriteString(fmt.Sprintf("key%d: value%d\n", i, i))
+	}
+
+	if err := os.WriteFile(path, []byte(sb.String()), 0644); err != nil {
+		t.Fatalf("failed to write test file: %v", err)
+	}
+
+	_, err := LoadPersona(path)
+	if err == nil {
+		t.Fatal("expected error for wide YAML exceeding node count, got nil")
+	}
+	if !strings.Contains(err.Error(), "node count exceeds") {
+		t.Errorf("error = %q, want containing 'node count exceeds'", err.Error())
+	}
+}
+
+func TestCheckYAMLDepthCycleDetectionDirect(t *testing.T) {
+	// Direct test of cycle detection in checkYAMLDepth by creating
+	// a node structure with an artificial cycle.
+	// This tests the seen map logic independent of go-yaml's parsing.
+	node := &yaml.Node{
+		Kind: yaml.MappingNode,
+		Content: []*yaml.Node{
+			{Kind: yaml.ScalarNode, Value: "key"},
+			{Kind: yaml.ScalarNode, Value: "value"},
+		},
+	}
+
+	// Create a cycle by making a child reference the parent
+	cycleChild := &yaml.Node{
+		Kind:  yaml.AliasNode,
+		Alias: node, // Points back to the parent
+	}
+	node.Content = append(node.Content,
+		&yaml.Node{Kind: yaml.ScalarNode, Value: "cyclic"},
+		cycleChild,
+	)
+
+	nodeCount := 0
+	seen := make(map[*yaml.Node]struct{})
+	err := checkYAMLDepth(node, 0, MaxYAMLDepth, MaxYAMLNodes, seen, &nodeCount)
+
+	// Should complete without infinite recursion due to cycle detection
+	if err != nil {
+		t.Errorf("unexpected error: %v", err)
+	}
+	// The seen map should contain multiple entries
+	if len(seen) < 2 {
+		t.Errorf("seen map has %d entries, expected at least 2", len(seen))
+	}
+}
+
+func TestListBuiltinPersonasSortedOrder(t *testing.T) {
+	names := ListBuiltinPersonas()
+	if len(names) < 2 {
+		t.Skip("need at least 2 personas to test ordering")
+	}
+
+	// Verify the list is sorted
+	for i := 1; i < len(names); i++ {
+		if names[i-1] > names[i] {
+			t.Errorf("ListBuiltinPersonas not sorted: %q > %q", names[i-1], names[i])
+		}
+	}
+}
+
+func TestYAMLUnknownFieldsRejected(t *testing.T) {
+	tests := []struct {
+		name    string
+		content string
+		wantErr string
+	}{
+		{
+			name: "unknown top-level field",
+			content: `name: test
+identity: test identity
+unknown_field: should fail
+`,
+			wantErr: "unknown_field",
+		},
+		{
+			name: "typo in field name",
+			content: `name: test
+identiy: typo should fail
+`,
+			wantErr: "identiy",
+		},
+		{
+			name: "unknown field in severity",
+			content: `name: test
+identity: test
+severity:
+  major: Major
+  minro: typo
+`,
+			wantErr: "minro",
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			dir := t.TempDir()
+			path := filepath.Join(dir, "unknown.yaml")
+			if err := os.WriteFile(path, []byte(tt.content), 0644); err != nil {
+				t.Fatalf("failed to write test file: %v", err)
+			}
+
+			_, err := LoadPersona(path)
+			if err == nil {
+				t.Errorf("expected error for unknown field %q, got nil", tt.wantErr)
+				return
+			}
+			if !strings.Contains(err.Error(), tt.wantErr) {
+				t.Errorf("error = %q, want containing %q", err.Error(), tt.wantErr)
+			}
+		})
+	}
+}
+
+func TestJSONUnknownFieldsRejected(t *testing.T) {
+	tests := []struct {
+		name    string
+		content string
+		wantErr string
+	}{
+		{
+			name: "unknown top-level field",
+			content: `{
+				"name": "test",
+				"identity": "test identity",
+				"unknown_field": "should fail"
+			}`,
+			wantErr: "unknown_field",
+		},
+		{
+			name: "typo in field name",
+			content: `{
+				"name": "test",
+				"identiy": "typo should fail"
+			}`,
+			wantErr: "identiy",
+		},
+		{
+			name: "unknown field in severity",
+			content: `{
+				"name": "test",
+				"identity": "test",
+				"severity": {
+					"major": "ok",
+					"miner": "typo"
+				}
+			}`,
+			wantErr: "miner",
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			dir := t.TempDir()
+			path := filepath.Join(dir, "test.json")
+			if err := os.WriteFile(path, []byte(tt.content), 0644); err != nil {
+				t.Fatalf("failed to write test file: %v", err)
+			}
+
+			_, err := LoadPersona(path)
+			if err == nil {
+				t.Fatal("expected error for unknown field, got nil")
+			}
+			if !strings.Contains(err.Error(), tt.wantErr) {
+				t.Errorf("error = %q, want to contain %q", err.Error(), tt.wantErr)
+			}
+		})
+	}
+}
+
+func TestLoadPersonaSymlink(t *testing.T) {
+	// Create a regular persona file
+	dir := t.TempDir()
+	realFile := filepath.Join(dir, "real.yaml")
+	content := `name: test
+identity: test identity
+`
+	if err := os.WriteFile(realFile, []byte(content), 0644); err != nil {
+		t.Fatalf("failed to write test file: %v", err)
+	}
+
+	// Create a symlink to it
+	symlink := filepath.Join(dir, "link.yaml")
+	if err := os.Symlink(realFile, symlink); err != nil {
+		t.Skipf("symlinks not supported in this environment: %v", err)
+	}
+
+	// LoadPersona should work via symlink
+	p, err := LoadPersona(symlink)
+	if err != nil {
+		t.Fatalf("LoadPersona via symlink failed: %v", err)
+	}
+	if p.Name != "test" {
+		t.Errorf("Name = %q, want %q", p.Name, "test")
+	}
+}
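
The cycle tests above call `checkYAMLDepth`, whose body is not part of this excerpt. As a rough illustration of the technique they exercise (a depth and node-count walk guarded by a `seen` set), here is a minimal sketch. It assumes a simplified stand-in `node` type instead of `yaml.Node` so that it runs on the standard library alone; `node`, `checkDepth`, `buildCycle`, `buildChain`, and the error values are hypothetical names, and the real implementation may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a hypothetical stand-in for yaml.Node: child nodes plus an
// optional alias target.
type node struct {
	children []*node
	alias    *node
}

var (
	errTooDeep      = errors.New("nesting depth exceeds limit")
	errTooManyNodes = errors.New("node count exceeds limit")
)

// checkDepth walks the tree and rejects structures deeper than maxDepth or
// larger than maxNodes. The seen set stops the walk from re-entering a node
// it has already visited, so an alias that points back at an ancestor
// cannot cause unbounded recursion.
func checkDepth(n *node, depth, maxDepth, maxNodes int, seen map[*node]struct{}, count *int) error {
	if n == nil {
		return nil
	}
	if _, ok := seen[n]; ok {
		return nil // already visited: shared subtree or cycle
	}
	seen[n] = struct{}{}
	*count++
	if *count > maxNodes {
		return errTooManyNodes
	}
	if depth > maxDepth {
		return errTooDeep
	}
	if n.alias != nil {
		return checkDepth(n.alias, depth+1, maxDepth, maxNodes, seen, count)
	}
	for _, c := range n.children {
		if err := checkDepth(c, depth+1, maxDepth, maxNodes, seen, count); err != nil {
			return err
		}
	}
	return nil
}

// buildCycle returns a parent whose only child aliases back to the parent.
func buildCycle() *node {
	parent := &node{}
	parent.children = append(parent.children, &node{alias: parent})
	return parent
}

// buildChain returns a chain of n nodes, each nested inside the previous one.
func buildChain(n int) *node {
	root := &node{}
	cur := root
	for i := 1; i < n; i++ {
		next := &node{}
		cur.children = []*node{next}
		cur = next
	}
	return root
}

func main() {
	count := 0
	err := checkDepth(buildCycle(), 0, 20, 1000, map[*node]struct{}{}, &count)
	fmt.Println("cycle:", err, "nodes visited:", count)

	count = 0
	err = checkDepth(buildChain(30), 0, 20, 1000, map[*node]struct{}{}, &count)
	fmt.Println("chain:", err)
}
```

One consequence of this design is that a subtree reached through several aliases is counted only once. That keeps the walk finite, but it means the node limit bounds distinct nodes rather than the fully expanded document.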
@@ -1,26 +0,0 @@
-{
-  "name": "architect",
-  "display_name": "Software Architect",
-  "identity": "You are a software architect reviewing code for design quality.\n\nYour expertise:\n- Design patterns and anti-patterns\n- Code organization and module boundaries\n- API design and contracts\n- Testability and dependency injection\n- Consistency with existing architecture\n- Technical debt identification",
-  "focus": [
-    "Design pattern violations or misuse",
-    "Module boundary violations (inappropriate coupling)",
-    "API design issues (unclear contracts, leaky abstractions)",
-    "Testability problems (hidden dependencies, god objects)",
-    "Inconsistency with existing codebase patterns",
-    "Unnecessary complexity or over-engineering",
-    "Missing abstractions or premature abstraction"
-  ],
-  "ignore": [
-    "Security vulnerabilities (security persona handles these)",
-    "Performance micro-optimizations",
-    "Code style and formatting",
-    "Documentation typos",
-    "Test implementation details"
-  ],
-  "severity": {
-    "major": "Architectural violations that will cause maintenance problems or make the codebase harder to evolve",
-    "minor": "Design issues that reduce clarity or testability but don't block progress",
-    "nit": "Minor pattern deviations or style preferences"
-  }
-}
@@ -0,0 +1,37 @@
+# Software Architect Persona
+# Focuses on design quality, patterns, and code organization
+
+name: architect
+display_name: Software Architect
+
+identity: |
+  You are a software architect reviewing code for design quality.
+
+  Your expertise:
+  - Design patterns and anti-patterns
+  - Code organization and module boundaries
+  - API design and contracts
+  - Testability and dependency injection
+  - Consistency with existing architecture
+  - Technical debt identification
+
+focus:
+  - Design pattern violations or misuse
+  - Module boundary violations (inappropriate coupling)
+  - API design issues (unclear contracts, leaky abstractions)
+  - Testability problems (hidden dependencies, god objects)
+  - Inconsistency with existing codebase patterns
+  - Unnecessary complexity or over-engineering
+  - Missing abstractions or premature abstraction
+
+ignore:
+  - Security vulnerabilities (security persona handles these)
+  - Performance micro-optimizations
+  - Code style and formatting
+  - Documentation typos
+  - Test implementation details
+
+severity:
+  major: "Architectural violations that will cause maintenance problems or make the codebase harder to evolve"
+  minor: "Design issues that reduce clarity or testability but don't block progress"
+  nit: "Minor pattern deviations or style preferences"
@@ -1,26 +0,0 @@
-{
-  "name": "docs",
-  "display_name": "Documentation Reviewer",
-  "identity": "You are a documentation specialist reviewing code for clarity and documentation quality.\n\nYour expertise:\n- API documentation and examples\n- Code comments and their accuracy\n- Error message clarity\n- README and guide quality\n- Naming clarity and self-documenting code",
-  "focus": [
-    "Missing or outdated documentation",
-    "Unclear or misleading comments",
-    "Poor error messages (cryptic, unhelpful, missing context)",
-    "Confusing naming (functions, variables, types)",
-    "Missing examples for complex APIs",
-    "Inconsistent terminology",
-    "Documentation that contradicts the code"
-  ],
-  "ignore": [
-    "Security vulnerabilities",
-    "Performance issues",
-    "Design patterns",
-    "Test coverage",
-    "Code style (unless it affects readability)"
-  ],
-  "severity": {
-    "major": "Documentation that actively misleads or missing docs for critical functionality",
-    "minor": "Unclear documentation or poor error messages that will confuse users",
-    "nit": "Minor clarity improvements or typo fixes"
-  }
-}
@@ -0,0 +1,36 @@
+# Documentation Reviewer Persona
+# Focuses on clarity, documentation quality, and self-documenting code
+
+name: docs
+display_name: Documentation Reviewer
+
+identity: |
+  You are a documentation specialist reviewing code for clarity and documentation quality.
+
+  Your expertise:
+  - API documentation and examples
+  - Code comments and their accuracy
+  - Error message clarity
+  - README and guide quality
+  - Naming clarity and self-documenting code
+
+focus:
+  - Missing or outdated documentation
+  - Unclear or misleading comments
+  - Poor error messages (cryptic, unhelpful, missing context)
+  - Confusing naming (functions, variables, types)
+  - Missing examples for complex APIs
+  - Inconsistent terminology
+  - Documentation that contradicts the code
+
+ignore:
+  - Security vulnerabilities
+  - Performance issues
+  - Design patterns
+  - Test coverage
+  - Code style (unless it affects readability)
+
+severity:
+  major: "Documentation that actively misleads or missing docs for critical functionality"
+  minor: "Unclear documentation or poor error messages that will confuse users"
+  nit: "Minor clarity improvements or typo fixes"
@@ -1,26 +0,0 @@
-{
-  "name": "security",
-  "display_name": "Security Specialist",
-  "identity": "You are a security specialist reviewing code for vulnerabilities.\n\nYour expertise:\n- OWASP Top 10 vulnerabilities\n- Injection attacks (SQL, command, path traversal, template)\n- Authentication and authorization patterns\n- Secrets management and exposure risks\n- Race conditions with security implications\n- Event sourcing attack vectors (replay attacks, event injection)",
-  "focus": [
-    "Injection attacks (SQL, command, path traversal, template injection)",
-    "Authentication and authorization gaps or bypasses",
-    "Secrets exposure (hardcoded credentials, tokens in logs, config leaks)",
-    "Input validation failures (unsanitized input, unsafe deserialization)",
-    "Race conditions that could be exploited",
-    "Cryptographic weaknesses (weak algorithms, improper key handling)",
-    "Information disclosure through error messages or logs"
-  ],
-  "ignore": [
-    "Code style and naming conventions",
-    "Performance optimizations (unless security-related)",
-    "Documentation quality",
-    "General code quality or readability",
-    "Test coverage"
-  ],
-  "severity": {
-    "major": "Exploitable vulnerabilities: auth bypass, injection, data exfiltration, privilege escalation, RCE",
-    "minor": "Defense-in-depth issues: missing rate limiting, verbose errors, weak input validation",
-    "nit": "Theoretical risks with low exploitability or impact"
-  }
-}
@@ -0,0 +1,37 @@
+# Security Specialist Persona
+# Focuses on vulnerabilities, auth issues, and security best practices
+
+name: security
+display_name: Security Specialist
+
+identity: |
+  You are a security specialist reviewing code for vulnerabilities.
+
+  Your expertise:
+  - OWASP Top 10 vulnerabilities
+  - Injection attacks (SQL, command, path traversal, template)
+  - Authentication and authorization patterns
+  - Secrets management and exposure risks
+  - Race conditions with security implications
+  - Event sourcing attack vectors (replay attacks, event injection)
+
+focus:
+  - Injection attacks (SQL, command, path traversal, template injection)
+  - Authentication and authorization gaps or bypasses
+  - Secrets exposure (hardcoded credentials, tokens in logs, config leaks)
+  - Input validation failures (unsanitized input, unsafe deserialization)
+  - Race conditions that could be exploited
+  - Cryptographic weaknesses (weak algorithms, improper key handling)
+  - Information disclosure through error messages or logs
+
+ignore:
+  - Code style and naming conventions
+  - Performance optimizations (unless security-related)
+  - Documentation quality
+  - General code quality or readability
+  - Test coverage
+
+severity:
+  major: "Exploitable vulnerabilities: auth bypass, injection, data exfiltration, privilege escalation, RCE"
+  minor: "Defense-in-depth issues: missing rate limiting, verbose errors, weak input validation"
+  nit: "Theoretical risks with low exploitability or impact"
@@ -0,0 +1,150 @@
+package review
+
+import (
+	"context"
+	"log/slog"
+	"strings"
+)
+
+// RepoPersonaPath is the directory path where repo-specific personas are stored.
+const RepoPersonaPath = ".review-bot/personas"
+
+// GiteaClient defines the subset of gitea.Client methods needed for loading repo personas.
+// This interface allows for easier testing and decouples the review package from gitea.
+type GiteaClient interface {
+	ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error)
+	GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error)
+}
+
+// ContentEntry represents a file or directory entry from the contents API.
+// This mirrors gitea.ContentEntry to avoid import cycles.
+type ContentEntry struct {
+	Name string `json:"name"`
+	Path string `json:"path"`
+	Type string `json:"type"` // "file" or "dir"
+}
+
+// LoadRepoPersonas fetches personas from a repository's .review-bot/personas/ directory.
+// Returns an empty map (not nil) if the directory doesn't exist or is empty.
+// Files that fail to fetch or parse are logged and skipped; the remaining personas are still returned.
+// Errors from listing the directory other than a 404 (auth, server) are propagated.
+// Files exceeding MaxPersonaFileSize are skipped with a warning to prevent resource exhaustion.
+func LoadRepoPersonas(ctx context.Context, client GiteaClient, owner, repo string) (map[string]*Persona, error) {
+	result := make(map[string]*Persona)
+
+	entries, err := client.ListContents(ctx, owner, repo, RepoPersonaPath)
+	if err != nil {
+		// Check if this is a 404 (directory doesn't exist) - expected case
+		if isNotFoundError(err) {
+			slog.Debug("no repo personas directory found", "repo", owner+"/"+repo)
+			return result, nil
+		}
+		// Other errors (auth, server) should propagate
+		return nil, err
+	}
+
+	if len(entries) == 0 {
+		slog.Debug("repo personas directory is empty", "repo", owner+"/"+repo)
+		return result, nil
+	}
+
+	for _, entry := range entries {
+		if entry.Type != "file" {
+			continue
+		}
+		// Only process YAML files
+		if !isYAMLFile(entry.Name) {
+			continue
+		}
+
+		content, err := client.GetFileContent(ctx, owner, repo, entry.Path)
+		if err != nil {
+			slog.Warn("could not fetch repo persona file",
+				"file", entry.Path,
+				"repo", owner+"/"+repo,
+				"error", err)
+			continue
+		}
+
+		// Enforce size limit before parsing to prevent resource exhaustion
+		if len(content) > MaxPersonaFileSize {
+			slog.Warn("repo persona file exceeds maximum size",
+				"file", entry.Path,
+				"repo", owner+"/"+repo,
+				"size", len(content),
+				"max", MaxPersonaFileSize)
+			continue
+		}
+
+		persona, err := ParsePersonaBytes([]byte(content), entry.Path)
+		if err != nil {
+			slog.Warn("could not parse repo persona file",
+				"file", entry.Path,
+				"repo", owner+"/"+repo,
+				"error", err)
+			continue
+		}
+
+		result[persona.Name] = persona
+		slog.Debug("loaded repo persona",
+			"name", persona.Name,
+			"file", entry.Path,
+			"repo", owner+"/"+repo)
+	}
+
+	return result, nil
+}
+
+// MergePersonas combines built-in personas with repo personas.
+// Repo personas take precedence on name collision.
+// Returns a new map; inputs are not modified.
+func MergePersonas(builtin, repo map[string]*Persona) map[string]*Persona {
+	result := make(map[string]*Persona, len(builtin)+len(repo))
+
+	// Copy built-in personas first
+	for name, p := range builtin {
+		result[name] = p
+	}
+
+	// Overlay repo personas (override on collision)
+	for name, p := range repo {
+		if _, exists := result[name]; exists {
+			slog.Debug("repo persona overrides built-in", "name", name)
+		}
+		result[name] = p
+	}
+
+	return result
+}
+
+// GetBuiltinPersonasMap returns all built-in personas as a map keyed by name.
+// Personas that fail to load are logged and skipped; the result is never nil.
+func GetBuiltinPersonasMap() map[string]*Persona {
+	result := make(map[string]*Persona)
+	for _, name := range ListBuiltinPersonas() {
+		p, err := LoadBuiltinPersona(name)
+		if err != nil {
+			slog.Warn("could not load built-in persona", "name", name, "error", err)
+			continue
+		}
+		result[name] = p
+	}
+	return result
+}
+
+// isYAMLFile checks if a filename has a YAML extension.
+func isYAMLFile(name string) bool {
+	lower := strings.ToLower(name)
+	return strings.HasSuffix(lower, ".yaml") || strings.HasSuffix(lower, ".yml")
+}
+
+// isNotFoundError checks if an error represents a 404 response.
+// This uses a specific "HTTP 404" substring match rather than a generic "not found"
+// match to avoid masking authentication failures or transport errors that might
+// contain "not found" in their message.
+func isNotFoundError(err error) bool {
+	if err == nil {
+		return false
+	}
+	return strings.Contains(err.Error(), "HTTP 404")
+}
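
`isNotFoundError` depends on the client formatting its errors with an `HTTP 404` prefix. If the client exposed the status code on a typed error instead, the same check could be written with `errors.As`. The following is a sketch under that assumption only: `HTTPError`, its fields, and `isNotFound` are hypothetical and are not part of the gitea client shown in this diff.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// HTTPError is a hypothetical error type that carries the response status code.
type HTTPError struct {
	StatusCode int
	Msg        string
}

func (e *HTTPError) Error() string {
	return fmt.Sprintf("HTTP %d: %s", e.StatusCode, e.Msg)
}

// isNotFound reports whether err, or any error it wraps, is an HTTPError
// with status 404. It inspects the status code, so a 401 whose message
// happens to contain "not found" is not mistaken for a missing directory.
func isNotFound(err error) bool {
	var he *HTTPError
	return errors.As(err, &he) && he.StatusCode == http.StatusNotFound
}

func main() {
	notFound := fmt.Errorf("list contents: %w", &HTTPError{StatusCode: 404, Msg: "not found"})
	unauthorized := &HTTPError{StatusCode: 401, Msg: "user not found"}

	fmt.Println(isNotFound(notFound))
	fmt.Println(isNotFound(unauthorized))
	fmt.Println(isNotFound(nil))
}
```

The substring approach in the diff has the advantage of needing no changes to the client; the typed variant trades that for a check that survives changes to the error message wording.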
@@ -0,0 +1,443 @@
+package review
+
+import (
+	"context"
+	"errors"
+	"strings"
+	"testing"
+)
+
+func TestParsePersonaBytes(t *testing.T) {
+	tests := []struct {
+		name     string
+		data     string
+		source   string
+		wantName string
+		wantErr  string
+	}{
+		{
+			name: "valid yaml",
+			data: `name: test
+identity: test identity
+focus:
+  - testing
+`,
+			source:   "test.yaml",
+			wantName: "test",
+		},
+		{
+			name:    "missing name",
+			data:    "identity: test\n",
+			source:  "test.yaml",
+			wantErr: "name is required",
+		},
+		{
+			name:    "invalid yaml",
+			data:    "not: valid:\n  yaml: [broken",
+			source:  "test.yaml",
+			wantErr: "parse",
+		},
+		{
+			name:     "json format by extension",
+			data:     `{"name": "jsontest", "identity": "json identity"}`,
+			source:   "test.json",
+			wantName: "jsontest",
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			p, err := ParsePersonaBytes([]byte(tt.data), tt.source)
+			if tt.wantErr != "" {
+				if err == nil {
+					t.Fatalf("expected error containing %q, got nil", tt.wantErr)
+				}
+				if !strings.Contains(err.Error(), tt.wantErr) {
+					t.Errorf("error = %q, want containing %q", err.Error(), tt.wantErr)
+				}
+				return
+			}
+			if err != nil {
+				t.Fatalf("unexpected error: %v", err)
+			}
+			if p.Name != tt.wantName {
+				t.Errorf("Name = %q, want %q", p.Name, tt.wantName)
+			}
+		})
+	}
+}
+
+// mockGiteaClient implements GiteaClient for testing.
+type mockGiteaClient struct {
+	contents map[string][]ContentEntry // path -> entries
+	files    map[string]string         // path -> content
+	listErr  error
+	fileErr  map[string]error // path -> error
+}
+
+func (m *mockGiteaClient) ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error) {
+	if m.listErr != nil {
+		return nil, m.listErr
+	}
+	entries, ok := m.contents[path]
+	if !ok {
+		return nil, errors.New("list contents .review-bot/personas: HTTP 404: not found")
+	}
+	return entries, nil
+}
+
+func (m *mockGiteaClient) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
+	if m.fileErr != nil {
+		if err, ok := m.fileErr[filepath]; ok {
+			return "", err
+		}
+	}
+	content, ok := m.files[filepath]
+	if !ok {
+		return "", errors.New("HTTP 404: file not found")
+	}
+	return content, nil
+}
+
+func TestLoadRepoPersonas(t *testing.T) {
+	ctx := context.Background()
+
+	t.Run("directory not found returns empty map", func(t *testing.T) {
+		client := &mockGiteaClient{} // No contents configured -> 404
+		personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
+		if err != nil {
+			t.Fatalf("unexpected error: %v", err)
+		}
+		if personas == nil {
+			t.Error("expected empty map, got nil")
+		}
+		if len(personas) != 0 {
+			t.Errorf("expected 0 personas, got %d", len(personas))
+		}
+	})
+
+	t.Run("empty directory returns empty map", func(t *testing.T) {
+		client := &mockGiteaClient{
+			contents: map[string][]ContentEntry{
+				RepoPersonaPath: {},
+			},
+		}
+		personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
+		if err != nil {
+			t.Fatalf("unexpected error: %v", err)
+		}
+		if len(personas) != 0 {
+			t.Errorf("expected 0 personas, got %d", len(personas))
+		}
+	})
+
+	t.Run("loads valid personas", func(t *testing.T) {
+		client := &mockGiteaClient{
+			contents: map[string][]ContentEntry{
+				RepoPersonaPath: {
+					{Name: "trading.yaml", Path: ".review-bot/personas/trading.yaml", Type: "file"},
+					{Name: "crypto.yaml", Path: ".review-bot/personas/crypto.yaml", Type: "file"},
+				},
+			},
+			files: map[string]string{
+				".review-bot/personas/trading.yaml": `name: trading
+display_name: Trading Expert
+identity: You are a trading expert.
+focus:
+  - order handling
+  - risk management
+`,
+				".review-bot/personas/crypto.yaml": `name: crypto
+display_name: Crypto Expert
+identity: You are a cryptography expert.
+focus:
+  - key management
+  - encryption
+`,
+			},
+		}
+		personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
+		if err != nil {
+			t.Fatalf("unexpected error: %v", err)
+		}
+		if len(personas) != 2 {
+			t.Fatalf("expected 2 personas, got %d", len(personas))
+		}
+		if personas["trading"] == nil {
+			t.Fatal("expected trading persona")
+		}
+		if personas["crypto"] == nil {
+			t.Error("expected crypto persona")
+		}
+		if personas["trading"].DisplayName != "Trading Expert" {
+			t.Errorf("trading display name = %q, want %q", personas["trading"].DisplayName, "Trading Expert")
+		}
+	})
+
+	t.Run("skips invalid persona files", func(t *testing.T) {
+		client := &mockGiteaClient{
+			contents: map[string][]ContentEntry{
+				RepoPersonaPath: {
+					{Name: "valid.yaml", Path: ".review-bot/personas/valid.yaml", Type: "file"},
+					{Name: "invalid.yaml", Path: ".review-bot/personas/invalid.yaml", Type: "file"},
+				},
+			},
+			files: map[string]string{
+				".review-bot/personas/valid.yaml": `name: valid
+identity: Valid persona
+`,
+				".review-bot/personas/invalid.yaml": "not valid yaml: [broken",
+			},
+		}
+		personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
+		if err != nil {
+			t.Fatalf("unexpected error: %v", err)
+		}
+		// Should have the valid one, skip the invalid
+		if len(personas) != 1 {
+			t.Fatalf("expected 1 persona (skipped invalid), got %d", len(personas))
+		}
+		if personas["valid"] == nil {
+			t.Error("expected valid persona")
+		}
+	})
+
+	t.Run("skips non-yaml files", func(t *testing.T) {
+		client := &mockGiteaClient{
+			contents: map[string][]ContentEntry{
+				RepoPersonaPath: {
+					{Name: "persona.yaml", Path: ".review-bot/personas/persona.yaml", Type: "file"},
+					{Name: "README.md", Path: ".review-bot/personas/README.md", Type: "file"},
+					{Name: "notes.txt", Path: ".review-bot/personas/notes.txt", Type: "file"},
+				},
+			},
+			files: map[string]string{
+				".review-bot/personas/persona.yaml": `name: test
+identity: Test persona
+`,
+				".review-bot/personas/README.md": "# Personas\n\nPut your personas here.",
+			},
+		}
+		personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
+		if err != nil {
+			t.Fatalf("unexpected error: %v", err)
+		}
+		if len(personas) != 1 {
+			t.Fatalf("expected 1 persona (yaml only), got %d", len(personas))
+		}
+	})
+
+	t.Run("skips subdirectories", func(t *testing.T) {
+		client := &mockGiteaClient{
+			contents: map[string][]ContentEntry{
+				RepoPersonaPath: {
+					{Name: "persona.yaml", Path: ".review-bot/personas/persona.yaml", Type: "file"},
+					{Name: "subdir", Path: ".review-bot/personas/subdir", Type: "dir"},
+				},
+			},
+			files: map[string]string{
+				".review-bot/personas/persona.yaml": `name: test
+identity: Test persona
+`,
+			},
+		}
+		personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
+		if err != nil {
+			t.Fatalf("unexpected error: %v", err)
+		}
+		if len(personas) != 1 {
+			t.Fatalf("expected 1 persona (files only), got %d", len(personas))
+		}
+	})
+
+	t.Run("propagates auth errors", func(t *testing.T) {
+		client := &mockGiteaClient{
+			listErr: errors.New("HTTP 401: unauthorized"),
+		}
+		_, err := LoadRepoPersonas(ctx, client, "owner", "repo")
+		if err == nil {
+			t.Fatal("expected error for auth failure")
+		}
+		if !strings.Contains(err.Error(), "401") {
+			t.Errorf("error = %q, want containing '401'", err.Error())
+		}
+	})
+
+	t.Run("skips files that fail to fetch", func(t *testing.T) {
+		client := &mockGiteaClient{
+			contents: map[string][]ContentEntry{
+				RepoPersonaPath: {
+					{Name: "good.yaml", Path: ".review-bot/personas/good.yaml", Type: "file"},
+					{Name: "bad.yaml", Path: ".review-bot/personas/bad.yaml", Type: "file"},
+				},
+			},
+			files: map[string]string{
+				".review-bot/personas/good.yaml": `name: good
+identity: Good persona
+`,
+			},
+			fileErr: map[string]error{
+				".review-bot/personas/bad.yaml": errors.New("HTTP 500: internal server error"),
+			},
+		}
+		personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
+		if err != nil {
+			t.Fatalf("unexpected error: %v", err)
+		}
+		if len(personas) != 1 {
+			t.Fatalf("expected 1 persona (skipped failed fetch), got %d", len(personas))
+		}
+	})
+
+	t.Run("skips oversized files", func(t *testing.T) {
+		// Create a content string that exceeds MaxPersonaFileSize (64KB)
+		oversizedContent := strings.Repeat("a", MaxPersonaFileSize+1)
+		client := &mockGiteaClient{
+			contents: map[string][]ContentEntry{
+				RepoPersonaPath: {
+					{Name: "normal.yaml", Path: ".review-bot/personas/normal.yaml", Type: "file"},
+					{Name: "huge.yaml", Path: ".review-bot/personas/huge.yaml", Type: "file"},
+				},
+			},
+			files: map[string]string{
+				".review-bot/personas/normal.yaml": `name: normal
+identity: Normal sized persona
+`,
+				".review-bot/personas/huge.yaml": oversizedContent,
+			},
+		}
+		personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
+		if err != nil {
+			t.Fatalf("unexpected error: %v", err)
+		}
+		// Should have the normal one, skip the oversized
+		if len(personas) != 1 {
+			t.Fatalf("expected 1 persona (skipped oversized), got %d", len(personas))
+		}
+		if personas["normal"] == nil {
+			t.Error("expected normal persona")
+		}
+	})
+}
+
+func TestMergePersonas(t *testing.T) {
+	builtin := map[string]*Persona{
+		"security": {Name: "security", Identity: "Built-in security"},
+		"docs":     {Name: "docs", Identity: "Built-in docs"},
+	}
+	repo := map[string]*Persona{
+		"security": {Name: "security", Identity: "Repo security override"},
+		"trading":  {Name: "trading", Identity: "Repo trading"},
+	}
+
+	merged := MergePersonas(builtin, repo)
+
+	t.Run("repo overrides builtin on collision", func(t *testing.T) {
+		if merged["security"].Identity != "Repo security override" {
+			t.Errorf("security identity = %q, want repo override", merged["security"].Identity)
+		}
+	})
+
+	t.Run("builtin preserved when no collision", func(t *testing.T) {
+		if merged["docs"].Identity != "Built-in docs" {
+			t.Errorf("docs identity = %q, want built-in", merged["docs"].Identity)
+		}
+	})
+
+	t.Run("repo-only persona added", func(t *testing.T) {
+		if merged["trading"] == nil {
+			t.Error("expected trading persona from repo")
+		}
+		if merged["trading"].Identity != "Repo trading" {
+			t.Errorf("trading identity = %q, want repo", merged["trading"].Identity)
+		}
+	})
+
+	t.Run("original maps not modified", func(t *testing.T) {
+		if builtin["trading"] != nil {
+			t.Error("builtin map was modified")
+		}
+		if len(repo) != 2 {
+			t.Error("repo map was modified")
+		}
+	})
+}
+
+func TestGetBuiltinPersonasMap(t *testing.T) {
+	personas := GetBuiltinPersonasMap()
+
+	if len(personas) == 0 {
+		t.Fatal("expected at least one built-in persona")
+	}
+
+	// Verify expected personas exist
+	expected := []string{"security", "architect", "docs"}
+	for _, name := range expected {
+		if personas[name] == nil {
+			t.Errorf("expected built-in persona %q", name)
+		}
+	}
+
+	// Verify personas are valid
+	for name, p := range personas {
+		if p.Name != name {
+			t.Errorf("persona %q has mismatched name %q", name, p.Name)
+		}
+		if p.Identity == "" {
+			t.Errorf("persona %q has empty identity", name)
+		}
+	}
+}
+
+func TestIsYAMLFile(t *testing.T) {
+	tests := []struct {
+		name string
+		want bool
+	}{
+		{"test.yaml", true},
+		{"test.yml", true},
+		{"test.YAML", true},
+		{"test.YML", true},
+		{"test.json", false},
+		{"test.md", false},
+		{"test.txt", false},
+		{"yaml", false},
+		{"yaml.md", false},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			if got := isYAMLFile(tt.name); got != tt.want {
+				t.Errorf("isYAMLFile(%q) = %v, want %v", tt.name, got, tt.want)
+			}
+		})
+	}
+}
+
+func TestIsNotFoundError(t *testing.T) {
+	tests := []struct {
+		err  error
+		want bool
+	}{
+		{nil, false},
+		{errors.New("HTTP 404: not found"), true},
+		{errors.New("HTTP 404"), true},
+		// Intentionally false: generic "not found" could mask auth/transport errors.
+		// Only explicit HTTP 404 responses should be treated as "directory doesn't exist".
+		{errors.New("something not found"), false},
+		{errors.New("HTTP 401: unauthorized"), false},
+		{errors.New("connection refused"), false},
+	}
+
+	for _, tt := range tests {
+		name := "nil"
+		if tt.err != nil {
+			name = tt.err.Error()
+		}
+		t.Run(name, func(t *testing.T) {
+			if got := isNotFoundError(tt.err); got != tt.want {
+				t.Errorf("isNotFoundError(%v) = %v, want %v", tt.err, got, tt.want)
+			}
+		})
+	}
+}
@@ -1,61 +1,127 @@
-#!/bin/bash
+#!/usr/bin/env bash
 # check-deps.sh - Enforces the strict dependency allowlist from CONVENTIONS.md
 # Exit 1 if any unapproved import is found.
+#
+# Requires: Bash 4+ (for associative arrays), Go toolchain
+# 
+# The allowlist is parsed from CONVENTIONS.md to maintain a single source of truth.
+# Enforces Scope column: "test only" packages cannot appear in non-test code.

 set -euo pipefail

-# Approved third-party packages (from CONVENTIONS.md)
-ALLOWED=(
-    "gopkg.in/yaml.v3"
-    "github.com/google/go-cmp"
-)
+# Check bash version
+if ((BASH_VERSINFO[0] < 4)); then
+    echo "❌ Bash 4+ required (found ${BASH_VERSION})"
+    echo "   On macOS: brew install bash"
+    exit 1
+fi

-# Build regex pattern from allowed list
-ALLOWED_PATTERN=""
-for pkg in "${ALLOWED[@]}"; do
-    if [ -z "$ALLOWED_PATTERN" ]; then
-        ALLOWED_PATTERN="$pkg"
-    else
-        ALLOWED_PATTERN="$ALLOWED_PATTERN|$pkg"
+CONVENTIONS_FILE="${1:-CONVENTIONS.md}"
+
+if [ ! -f "$CONVENTIONS_FILE" ]; then
+    echo "❌ $CONVENTIONS_FILE not found"
+    exit 1
+fi
+
+# Parse approved packages from CONVENTIONS.md table using awk (POSIX-compatible)
+# Format: | `package` | use case | scope |
+declare -A ALLOWED_PROD=()
+declare -A ALLOWED_TEST=()
+
+while IFS= read -r line; do
+    # Use awk to extract package and scope from table row
+    pkg=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]*`|`[[:space:]]*$/, "", $2); print $2}')
+    scope=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]+|[[:space:]]+$/, "", $4); print tolower($4)}')
+    
+    if [ -n "$pkg" ] && [ "$pkg" != "Package" ] && [[ "$pkg" =~ ^[a-zA-Z] ]]; then
+        if [[ "$scope" == *"test"* ]]; then
+            ALLOWED_TEST["$pkg"]=1
+        else
+            ALLOWED_PROD["$pkg"]=1
+        fi
    fi
-done
+done < <(grep '| `' "$CONVENTIONS_FILE" 2>/dev/null || true)

-# Get all imports from go.mod (excluding the module itself and stdlib)
-IMPORTS=$(go list -m all 2>/dev/null | tail -n +2 | awk '{print $1}' || true)
+ALL_ALLOWED=("${!ALLOWED_PROD[@]}" "${!ALLOWED_TEST[@]}")

-if [ -z "$IMPORTS" ]; then
+if [ ${#ALL_ALLOWED[@]} -eq 0 ]; then
+    echo "⚠️  No approved packages found in $CONVENTIONS_FILE"
+    echo "   (This is fine if you want stdlib-only)"
+fi
+
+# Helper: check if import matches any package in an associative array (literal prefix, no glob)
+matches_allowlist() {
+    local import="$1"
+    shift
+    local -n allowlist=$1
+    
+    for allowed in "${!allowlist[@]}"; do
+        # Exact match
+        if [ "$import" = "$allowed" ]; then
+            return 0
+        fi
+        # Literal prefix match for subpackages: must match "pkg/" exactly
+        if [ "${import#"$allowed/"}" != "$import" ]; then
+            return 0
+        fi
+    done
+    return 1
+}
+
+# Get direct module dependencies from go.mod
+DIRECT_IMPORTS=$(go list -m -f '{{if and (not .Indirect) (not .Main)}}{{.Path}}{{end}}' all 2>&1) || {
+    echo "❌ Failed to list dependencies: $DIRECT_IMPORTS"
+    exit 1
+}
+DIRECT_IMPORTS=$(echo "$DIRECT_IMPORTS" | grep -v '^$' || true)
+
+if [ -z "$DIRECT_IMPORTS" ]; then
    echo "✅ No external dependencies"
    exit 0
 fi

+# Check ALL direct dependencies are in some allowlist
 VIOLATIONS=""
 while IFS= read -r import; do
-    # Skip empty lines
    [ -z "$import" ] && continue
    
-    # Check if import matches any allowed pattern (prefix match for subpackages)
-    MATCHED=false
-    for allowed in "${ALLOWED[@]}"; do
-        if [[ "$import" == "$allowed" ]] || [[ "$import" == "$allowed/"* ]]; then
-            MATCHED=true
-            break
-        fi
-    done
-    
-    if [ "$MATCHED" = false ]; then
-        VIOLATIONS="$VIOLATIONS\n  - $import"
+    if ! matches_allowlist "$import" ALLOWED_PROD && ! matches_allowlist "$import" ALLOWED_TEST; then
+        VIOLATIONS="${VIOLATIONS}  - ${import} (not in allowlist)"$'\n'
    fi
-done <<< "$IMPORTS"
+done <<< "$DIRECT_IMPORTS"

 if [ -n "$VIOLATIONS" ]; then
    echo "❌ UNAPPROVED DEPENDENCIES DETECTED"
-    echo -e "The following imports are not in the allowlist:$VIOLATIONS"
    echo ""
-    echo "To add a dependency:"
-    echo "  1. Open a PR that ONLY updates CONVENTIONS.md"
-    echo "  2. Get explicit approval from Aaron"
-    echo "  3. After merge, use the package in a separate PR"
+    echo "The following imports are not in the allowlist:"
+    printf "%s" "$VIOLATIONS"
+    echo ""
+    echo "To add a dependency, update CONVENTIONS.md (requires Aaron's approval)"
+    exit 1
+fi
+
+# Enforce Scope: test-only packages must not appear in non-test code
+# Get imports used by non-test code only (go list -deps without -test excludes test deps)
+PROD_IMPORTS=$(go list -deps -f '{{if not .Standard}}{{.ImportPath}}{{end}}' ./... 2>/dev/null || true)
+
+TEST_ONLY_IN_PROD=""
+for test_pkg in "${!ALLOWED_TEST[@]}"; do
+    # Use word-boundary matching: exact match or followed by /
+    if echo "$PROD_IMPORTS" | grep -qE "^${test_pkg}(/|\$|$)"; then
+        TEST_ONLY_IN_PROD="${TEST_ONLY_IN_PROD}  - ${test_pkg} (marked 'test only' but used in production code)"$'\n'
+    fi
+done
+
+if [ -n "$TEST_ONLY_IN_PROD" ]; then
+    echo "❌ TEST-ONLY DEPENDENCIES IN PRODUCTION CODE"
+    echo ""
+    printf "%s" "$TEST_ONLY_IN_PROD"
+    echo ""
+    echo "These packages are marked 'test only' in CONVENTIONS.md"
+    echo "and must only be imported from *_test.go files."
    exit 1
 fi

 echo "✅ All dependencies are approved"
+echo "   Direct module deps: $(echo "$DIRECT_IMPORTS" | wc -l | tr -d ' ')"
+echo "   Production allowlist: ${#ALLOWED_PROD[@]}, Test-only allowlist: ${#ALLOWED_TEST[@]}"
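The `matches_allowlist` helper in the script above accepts an import only when it equals an approved package or starts with that package followed by `/`, so a similarly named module does not slip through. As a rough illustration in Go, the same rule might look like the sketch below. The function name `matchesAllowlist` and the path `github.com/google/go-cmp-extras` are hypothetical and exist only for this example; they are not part of the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesAllowlist reports whether importPath is an approved package or a
// subpackage of one. The comparison is literal: no glob or regex characters
// are interpreted, which mirrors the parameter-expansion check in the script.
// (Hypothetical helper, written for illustration only.)
func matchesAllowlist(importPath string, allowed []string) bool {
	for _, pkg := range allowed {
		if importPath == pkg || strings.HasPrefix(importPath, pkg+"/") {
			return true
		}
	}
	return false
}

func main() {
	allowed := []string{"gopkg.in/yaml.v3", "github.com/google/go-cmp"}
	candidates := []string{
		"gopkg.in/yaml.v3",                // exact match
		"github.com/google/go-cmp/cmp",    // subpackage of an approved module
		"github.com/google/go-cmp-extras", // hypothetical look-alike, shares only a string prefix
	}
	for _, c := range candidates {
		fmt.Println(c, matchesAllowlist(c, allowed))
	}
}
```

Appending `/` before the prefix comparison is what keeps the look-alike path from matching; a bare prefix test on `github.com/google/go-cmp` would accept it.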
Author	SHA1	Message	Date
Rodin	1dd73bc4df	Revert "ci: disable setup-go cache (cache server unreachable)" This reverts commit `8f564ea4f8`.	2026-05-10 19:44:08 -07:00
Rodin	8f564ea4f8	ci: disable setup-go cache (cache server unreachable) Cache server at 192.168.109.55:35239 times out, adding 4+ minutes to each job. Disable until cache infra is fixed.	2026-05-10 19:43:46 -07:00
Rodin	9775cb098c	fix: address PR #61 review findings MAJOR: - LoadRepoPersonas: add MaxPersonaFileSize check before parsing to prevent resource exhaustion from oversized YAML files committed to target repositories MINOR: - isNotFoundError: tighten substring match to 'HTTP 404' only to avoid masking auth/transport errors containing generic 'not found' - main.go: remove duplicate flag.Parse() call - main.go: add comment explaining nil map indexing is safe in Go when LoadRepoPersonas returns an error Tests updated to reflect the intentional behavior change in isNotFoundError and added test case for oversized file rejection.	2026-05-10 19:29:13 -07:00
Rodin	3f06ba2ea6	feat: load personas from target repo .review-bot/personas/ Implements #60. - Add ParsePersonaBytes() for parsing personas from byte data - Add LoadRepoPersonas() to fetch personas from repo via Gitea API - Add MergePersonas() to combine built-in and repo personas - Add GetBuiltinPersonasMap() helper - Update main.go to load repo personas first, fall back to built-in - Add giteaClientAdapter to bridge gitea.Client to review.GiteaClient When --persona is specified, the bot now: 1. Attempts to fetch personas from .review-bot/personas/*.yaml 2. If the named persona exists in the repo, uses it 3. Otherwise falls back to built-in personas This allows repos to define domain-specific personas (e.g., trading experts for gargoyle, crypto experts for kms-lite) without modifying the review-bot codebase.	2026-05-10 19:05:37 -07:00
aweiker	593b249e09	Merge pull request 'feat: add YAML support for persona files' (#58) from issue-57 into main Reviewed-on: #58 Reviewed-by: security-review-bot Reviewed-by: Aaron Weiker	2026-05-11 01:39:43 +00:00
Rodin	10cd6203d4	fix: address remaining PR #58 review findings 1. Remove dead JSON fallback in LoadBuiltinPersona - The embed directive only includes *.yaml files - JSON fallback code could never succeed - Simplified function to only try YAML 2. JSON parsing now rejects unknown fields - Switched from json.Unmarshal to json.Decoder - DisallowUnknownFields() matches YAML's KnownFields(true) - Added test coverage for JSON unknown field rejection 3. Documented symlink support in LoadPersona - os.Stat follows symlinks, so symlinks to regular files work - Added doc comment explaining the behavior - Added test for symlink support	2026-05-10 17:53:42 -07:00
Aaron Weiker	26f326cf51	fix: add YAML alias cycle detection and multi-document rejection Address security review findings: MAJOR: Add cycle detection to checkYAMLDepth using a visited set (seen map[*yaml.Node]struct{}) to prevent infinite recursion from crafted YAML with self-referential aliases. MINOR fixes: - Add MaxYAMLNodes (1000) limit as defense-in-depth against wide-but-shallow structures that bypass depth limits - Increment depth when following alias targets (was incorrectly passing same depth, allowing alias chains to bypass depth limit) - Reject multi-document YAML files instead of silently ignoring additional documents (prevents confusing silent data loss) Tests added: - TestYAMLAliasCycleDetection: Direct test of cycle detection logic - TestYAMLMultiDocumentRejection: Verifies multi-doc files rejected - TestYAMLNodeCountLimit: Verifies wide structures are rejected - TestCheckYAMLDepthCycleDetectionDirect: Unit test with artificial cycle	2026-05-10 17:12:01 -07:00
Rodin	4fed59ac85	yaml: enable strict field checking to catch typos Addresses PR #58 MINOR finding: YAML decoder now rejects unknown fields. - Enable KnownFields(true) on YAML decoder to catch typos like 'focuss' or 'identiy' in persona files - Since yaml.Node.Decode() doesn't support KnownFields, we now do a two-pass decode: first pass checks depth limits, second pass decodes with strict field checking - Add tests for unknown field rejection at top-level and nested levels	2026-05-10 16:50:07 -07:00
Rodin	6035afeea7	fix: address MINOR review findings from `c3e8f0f` review	2026-05-10 16:29:44 -07:00
Rodin	c3e8f0f231	fix: address PR review findings MAJOR fixes: - Remove false security claim about gopkg.in/yaml.v3 having built-in depth protection - Add explicit YAML depth limiting via yaml.Node API (MaxYAMLDepth=20) - Add file size limit for persona files (MaxPersonaFileSize=64KB) - Add test for deeply nested YAML rejection MINOR fixes: - Add sort.Strings to ListBuiltinPersonas for deterministic ordering - Update design doc to reflect actual library used (gopkg.in/yaml.v3) - Update README: 'Zero dependencies' → 'Minimal dependencies' - Add test for file size limit - Add test for sorted persona list	2026-05-10 14:43:31 -07:00
Rodin and Rodin	7898dd939f	feat: add YAML support for persona files (#57) - Add gopkg.in/yaml.v3 dependency (approved in CONVENTIONS.md) - Update parsePersona to detect format by file extension - Support both .yaml and .yml extensions (case-insensitive) - Convert built-in personas to YAML format - Add comprehensive tests for YAML parsing - Update README with YAML examples and documentation YAML provides cleaner multi-line strings via literal block scalars and supports comments, making persona definitions more readable. JSON remains supported for backwards compatibility. Closes #57	2026-05-10 14:16:41 -07:00
rodin	fededd18ad	Merge pull request 'docs: allow approved third-party packages' (#59) from allow-deps into main docs: strict dependency allowlist with CI enforcement	2026-05-10 21:07:10 +00:00
Rodin	01cde16d47	fix: validate all deps and improve robustness Addresses GPT review feedback: 1. MAJOR - Test deps now validated: All direct module deps (from go.mod) are checked against the allowlist, whether used in prod or tests. 2. MINOR - Prefix match: Uses grep -E with word boundary (^pkg(/\|$\|$)) to avoid false positives on similarly-prefixed modules. 3. MINOR - Bash version check: Script now fails early with helpful message if Bash < 4 (macOS default). Added shebang: #!/usr/bin/env bash 4. NIT - Removed redundant grep -v '_test' (go list -deps already excludes test-only deps without -test flag).	2026-05-10 14:02:06 -07:00
Rodin	aeb0c8cb79	fix: enforce Scope column and improve portability Addresses review feedback: 1. MAJOR - Scope enforcement: Script now parses the Scope column and ensures 'test only' packages don't appear in non-test code. Uses 'go list -deps' to check production imports. 2. MINOR - Portability: Replaced 'grep -P' (GNU-only) with awk-based parsing that works on macOS/BSD. 3. MINOR - Robustness: Table parsing uses awk to split on '\|' and extract columns properly, handling whitespace variations. 4. MINOR - Glob safety: Prefix matching now uses parameter expansion instead of glob patterns to prevent metacharacter issues.	2026-05-10 13:57:49 -07:00
Rodin	70267b68f4	fix: address review feedback on dependency allowlist Fixes: - Single source of truth: script now parses allowlist from CONVENTIONS.md - Fail closed: script exits non-zero if 'go list' fails - Direct deps only: uses '-f' flag to exclude transitive deps - Added 'precommit' to .PHONY in Makefile - Removed unused ALLOWED_PATTERN variable - Added Scope column to distinguish test-only vs production deps - Clarified that transitive deps of approved packages are allowed - Added note that enforcement script parses the table	2026-05-10 13:53:55 -07:00