Compare commits

6 Commits

Author SHA1 Message Date
Rodin 489457c184 ci: retrigger after LLM_BASE_URL secret fix
CI / test (pull_request) Successful in 13s
CI / review (gpt-4.1, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 20s
CI / review (gpt-5, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 1m27s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m52s
2026-05-04 23:15:08 -07:00
Rodin 25d1a670bf fix: redesign repairJSON to handle all reviewer-reported edge cases
CI / test (pull_request) Successful in 11s
CI / review (gpt-4.1, gpt, GPT_REVIEW_TOKEN) (pull_request) Failing after 10s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Failing after 10s
CI / review (gpt-5, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Failing after 11s
Rewrites the JSON repair algorithm to address two correctness bugs
identified in code review:

1. Interior quoted word before comma: "say "yes", and go" was
   misidentified as structural close because "," followed the quote.

2. JSON-shaped content in strings: {"key": "val"} inside a string
   value was being parsed as actual JSON structure.

The new approach:
- Distinguishes keys from values (only values need repair)
- Uses first-valid-candidate scan with deep lookahead
- Verifies that after a candidate close, the continuation is not just
  a structural char but a complete valid JSON pattern
- Validates keyword tokens (true/false/null) fully, not just first char
- Checks container closes recursively for valid continuation

Adds comprehensive tests for all reported edge cases plus a complex
combined scenario with nested JSON-like content, quoted words before
commas, and multiple failure modes in one string.
2026-05-04 21:27:39 -07:00
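
The first bug described in 25d1a670bf can be illustrated with a small sketch. The helper names naiveClose and lookaheadClose are hypothetical and not part of the PR; both are simplified (no backslash-escape handling, no true/false/null check, no recursive container check) and only show why "quote followed by a comma" is not enough to identify the closing quote.

```go
package main

import "strings"

const ws = " \t\r\n"

// sampleBody is the text that follows the opening quote of the value in
// {"msg": "say "yes", and go"} (the example from the commit message).
const sampleBody = `say "yes", and go"}`

// naiveClose (hypothetical) returns the index of the first quote that is
// followed by ',', '}', ']' or end of input, or -1 if there is none.
func naiveClose(s string) int {
	for i := 0; i < len(s); i++ {
		if s[i] != '"' {
			continue
		}
		rest := strings.TrimLeft(s[i+1:], ws)
		if rest == "" || strings.ContainsRune(",}]", rune(rest[0])) {
			return i
		}
	}
	return -1
}

// lookaheadClose (hypothetical) also requires that a comma be followed by a
// character that can start a JSON token. Simplified: keywords are not handled.
func lookaheadClose(s string) int {
	for i := 0; i < len(s); i++ {
		if s[i] != '"' {
			continue
		}
		rest := strings.TrimLeft(s[i+1:], ws)
		switch {
		case rest == "":
			return i
		case rest[0] == '}' || rest[0] == ']':
			return i
		case rest[0] == ',':
			after := strings.TrimLeft(rest[1:], ws)
			if after != "" && strings.ContainsRune("\"{[-0123456789", rune(after[0])) {
				return i
			}
		}
	}
	return -1
}
```

On sampleBody, naiveClose stops at index 8 (the quote after yes, because a comma follows), while lookaheadClose rejects that candidate because the comma is followed by "and" and returns index 17, the real closing quote.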
Rodin 80a9a7675b fix: repair unescaped quotes in LLM JSON responses
CI / test (pull_request) Successful in 13s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Failing after 13s
CI / review (gpt-4.1, gpt, GPT_REVIEW_TOKEN) (pull_request) Failing after 13s
CI / review (gpt-5, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Failing after 12s
LLMs (especially Sonnet) sometimes emit JSON with unescaped double
quotes inside string values, for example the quotes in (e.g. "28"),
instead of escaping them properly. This caused parse failures in CI.

Add a repairJSON fallback that uses a character-by-character scanner
to identify interior quotes (those not followed by structural JSON
characters) and escape them before retrying the parse.

Fixes sonnet-review failures on gargoyle PR #551.
2026-05-03 09:47:22 -07:00
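
The failure mode behind 80a9a7675b can be reproduced with the standard library alone. This is a minimal sketch; parsesAsJSON and the two sample inputs are hypothetical names made up for illustration, modeled on the example in the commit message.

```go
package main

import "encoding/json"

// parsesAsJSON (hypothetical helper) reports whether s is syntactically
// valid JSON according to encoding/json.
func parsesAsJSON(s string) bool {
	return json.Valid([]byte(s))
}

const (
	// The quotes around 28 are not escaped, so the string value ends early
	// and the decoder sees a bare 28 where it expects ',' or '}'.
	unescaped = `{"finding": "major version (e.g. "28") expected"}`
	// Same content with the inner quotes escaped.
	escaped = `{"finding": "major version (e.g. \"28\") expected"}`
)
```

parsesAsJSON(unescaped) is false and parsesAsJSON(escaped) is true, which is the gap the repairJSON fallback is meant to close.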
rodin 8d8a249481 Merge pull request 'fix: supersede ALL old reviews, not just the most recent' (#43) from fix/supersede-all-old-reviews into main
CI / test (push) Successful in 13s
CI / review (gpt-4.1, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
Release / release (push) Successful in 31s
2026-05-02 20:35:23 +00:00
Rodin a0fd882b0d fix: address review findings
CI / test (pull_request) Successful in 14s
CI / review (gpt-4.1, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 24s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 37s
CI / review (gpt-5, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 1m4s
- Tighten timeline matching: also check ev.User.Login matches
  the review author (prevents collision on identical body prefix)
- Remove unused sharedTokenMode variable (inline condition)
- Aggregate resolution failures with warn-level summary
2026-05-02 13:31:59 -07:00
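
The tightened timeline matching from a0fd882b0d can be sketched as follows. The event type, findReviewEvent, and the sample data are simplified stand-ins invented for this illustration; they are not the PR's TimelineEvent type or client method.

```go
package main

import "strings"

// event is a simplified, hypothetical stand-in for a timeline entry.
type event struct {
	ID    int64
	Type  string
	Login string
	Body  string
}

// findReviewEvent returns the ID of the first "review" event whose author is
// login and whose body starts with prefix, or 0 if no event matches.
func findReviewEvent(events []event, login, prefix string) int64 {
	for _, ev := range events {
		if ev.Type == "review" && ev.Login == login && strings.HasPrefix(ev.Body, prefix) {
			return ev.ID
		}
	}
	return 0
}

// Two reviews by different (made-up) authors share the same body prefix.
var sampleEvents = []event{
	{ID: 10, Type: "review", Login: "gpt-bot", Body: "LGTM overall"},
	{ID: 11, Type: "review", Login: "sonnet-bot", Body: "LGTM overall, one nit"},
}
```

With the author check, findReviewEvent(sampleEvents, "sonnet-bot", "LGTM overall") returns 11. Matching on the body prefix alone would have picked event 10, which belongs to a different reviewer.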
Rodin d4bf13eeab fix: supersede ALL old reviews, not just the most recent
CI / test (pull_request) Successful in 14s
CI / review (gpt-4.1, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 22s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 46s
CI / review (gpt-5, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 1m7s
Previously findOwnReview returned only the single most-recent matching
review, so on PRs with multiple force-pushes only the latest old review
got superseded. The rest accumulated as unsuperseded stale reviews.

Changes:
- Add findAllOwnReviews() to collect all non-superseded matching reviews
- Loop over all old reviews in the supersede phase
- Add GetTimelineReviewCommentIDForReview() to find comment IDs by
  review ID (fetches review body, matches in timeline by prefix)
- Each old review gets independently superseded and its inline comments
  resolved

The old findOwnReview is kept for backward compat (tested, may be
useful as a utility).
2026-05-02 13:28:03 -07:00
5 changed files with 481 additions and 51 deletions
+42 -37
@@ -319,27 +319,16 @@ func main() {
 // 1. POST new review first (gets non-stale approval badge on HEAD)
 // 2. Then supersede old review with link to the new one
 // Order matters: post first so we have the new review's URL for the supersede message.
-var existingReview *gitea.Review
-var existingCommentID int64
+var oldReviews []gitea.Review
 if *reviewerName != "" {
 existingReviews, err := giteaClient.ListReviews(ctx, owner, repoName, prNumber)
 if err != nil {
 slog.Warn("could not list existing reviews", "pr", prNumber, "error", err)
 } else {
-sharedToken := hasSharedToken(existingReviews, sentinel)
-if !sharedToken {
-existingReview = findOwnReview(existingReviews, sentinel)
-if existingReview != nil {
-cid, err := giteaClient.GetTimelineReviewCommentID(ctx, owner, repoName, prNumber, sentinel)
-if err != nil {
-slog.Warn("could not find old review comment ID for supersede", "error", err)
-existingReview = nil // can't supersede without comment ID
-} else {
-existingCommentID = cid
-}
-}
-} else {
+if hasSharedToken(existingReviews, sentinel) {
 slog.Warn("shared token mode: skipping supersede to avoid clobbering sibling review")
+} else {
+oldReviews = findAllOwnReviews(existingReviews, sentinel)
 }
 }
 }
@@ -365,24 +354,28 @@ func main() {
 }
 slog.Info("review posted", "review_id", posted.ID, "user", posted.User.Login, "pr", prNumber)
-// Supersede old review with link to the new one
-if existingReview != nil && existingCommentID > 0 {
+// Supersede all old reviews with link to the new one
+if len(oldReviews) > 0 {
 newReviewURL := fmt.Sprintf("%s/%s/%s/pulls/%d#pullrequestreview-%d", strings.TrimRight(*giteaURL, "/"), owner, repoName, prNumber, posted.ID)
-supersededBody := buildSupersededBody(existingReview.Body, existingReview.CommitID, newReviewURL, sentinel)
-supersedeOK := false
-if err := giteaClient.EditComment(ctx, owner, repoName, existingCommentID, supersededBody); err != nil {
-slog.Warn("could not mark old review as superseded", "comment_id", existingCommentID, "error", err)
-} else {
-slog.Info("marked old review as superseded", "old_state", existingReview.State, "new_review_id", posted.ID, "pr", prNumber)
-supersedeOK = true
-}
-// Resolve old review's inline comments only after successful supersede
-if supersedeOK {
-oldComments, err := giteaClient.ListReviewComments(ctx, owner, repoName, prNumber, existingReview.ID)
+for _, oldReview := range oldReviews {
+cid, err := giteaClient.GetTimelineReviewCommentIDForReview(ctx, owner, repoName, prNumber, oldReview.ID)
 if err != nil {
-slog.Warn("could not list old review comments for resolution", "review_id", existingReview.ID, "error", err)
-} else {
+slog.Warn("could not find comment ID for old review", "review_id", oldReview.ID, "error", err)
+continue
+}
+supersededBody := buildSupersededBody(oldReview.Body, oldReview.CommitID, newReviewURL, sentinel)
+if err := giteaClient.EditComment(ctx, owner, repoName, cid, supersededBody); err != nil {
+slog.Warn("could not mark old review as superseded", "review_id", oldReview.ID, "comment_id", cid, "error", err)
+continue
+}
+slog.Info("marked old review as superseded", "review_id", oldReview.ID, "new_review_id", posted.ID, "pr", prNumber)
+// Resolve old review's inline comments
+oldComments, err := giteaClient.ListReviewComments(ctx, owner, repoName, prNumber, oldReview.ID)
+if err != nil {
+slog.Warn("could not list old review comments for resolution", "review_id", oldReview.ID, "error", err)
+continue
+}
 resolved, failed := 0, 0
 for _, c := range oldComments {
 if c.ID == 0 {
@@ -396,11 +389,10 @@ func main() {
 }
 }
 if resolved > 0 {
-slog.Info("resolved old inline comments", "count", resolved, "pr", prNumber)
+slog.Info("resolved old inline comments", "review_id", oldReview.ID, "count", resolved, "pr", prNumber)
 }
 if failed > 0 {
-slog.Warn("some inline comments could not be resolved", "failed", failed, "pr", prNumber)
+slog.Warn("some inline comments could not be resolved", "review_id", oldReview.ID, "failed", failed, "pr", prNumber)
-}
 }
 }
 }
@@ -627,21 +619,34 @@ func extractSentinelName(body string) string {
 return rest[:end]
 }
-// findOwnReview locates a review matching the given sentinel in its body.
+// findOwnReview locates the most recent non-superseded review matching the sentinel.
 func findOwnReview(reviews []gitea.Review, sentinel string) *gitea.Review {
 var best *gitea.Review
 for i := range reviews {
 if !strings.Contains(reviews[i].Body, sentinel) {
 continue
 }
-// Skip superseded reviews (they contain our sentinel in the collapsed body)
 if strings.Contains(reviews[i].Body, "~~Original review~~") {
 continue
 }
-// Take the highest ID (most recent)
 if best == nil || reviews[i].ID > best.ID {
 best = &reviews[i]
 }
 }
 return best
 }
+// findAllOwnReviews returns all non-superseded reviews matching the sentinel.
+func findAllOwnReviews(reviews []gitea.Review, sentinel string) []gitea.Review {
+var result []gitea.Review
+for i := range reviews {
+if !strings.Contains(reviews[i].Body, sentinel) {
+continue
+}
+if strings.Contains(reviews[i].Body, "~~Original review~~") {
+continue
+}
+result = append(result, reviews[i])
+}
+return result
+}
+21
@@ -841,3 +841,24 @@ func cleanEnv() []string {
}
return env
}
func TestFindAllOwnReviews(t *testing.T) {
reviews := []gitea.Review{
{ID: 1, Body: "<!-- review-bot:sonnet -->\nfirst review"},
{ID: 2, Body: "<!-- review-bot:gpt -->\nother bot"},
{ID: 3, Body: "<!-- review-bot:sonnet -->\nsecond review"},
{ID: 4, Body: "~~Original review~~\n<!-- review-bot:sonnet -->\nsuperseded"},
{ID: 5, Body: "<!-- review-bot:sonnet -->\nthird review"},
}
got := findAllOwnReviews(reviews, "<!-- review-bot:sonnet -->")
if len(got) != 3 {
t.Fatalf("findAllOwnReviews() returned %d, want 3", len(got))
}
wantIDs := []int64{1, 3, 5}
for i, r := range got {
if r.ID != wantIDs[i] {
t.Errorf("got[%d].ID = %d, want %d", i, r.ID, wantIDs[i])
}
}
}
+62
@@ -426,6 +426,68 @@ func (c *Client) GetTimelineReviewCommentID(ctx context.Context, owner, repo str
return 0, fmt.Errorf("no timeline event found with sentinel")
}
// GetTimelineReviewCommentIDForReview finds the timeline comment ID for a
// specific review by matching its body content in the timeline.
func (c *Client) GetTimelineReviewCommentIDForReview(ctx context.Context, owner, repo string, number int, reviewID int64) (int64, error) {
// Use the reviews API to get the review body, then find in timeline
reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d/reviews/%d",
c.baseURL,
url.PathEscape(owner),
url.PathEscape(repo),
number,
reviewID)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return 0, fmt.Errorf("get review %d: %w", reviewID, err)
}
var review struct {
Body string `json:"body"`
User struct {
Login string `json:"login"`
} `json:"user"`
}
if err := json.Unmarshal(body, &review); err != nil {
return 0, fmt.Errorf("parse review %d: %w", reviewID, err)
}
if review.Body == "" {
return 0, fmt.Errorf("review %d has empty body", reviewID)
}
// Use a prefix for matching (handles minor trailing whitespace differences)
matchPrefix := review.Body
if len(matchPrefix) > 200 {
matchPrefix = matchPrefix[:200]
}
const pageSize = 50
for page := 1; ; page++ {
timelineURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/issues/%d/timeline?limit=%d&page=%d",
c.baseURL,
url.PathEscape(owner),
url.PathEscape(repo),
number,
pageSize,
page)
tlBody, err := c.doGet(ctx, timelineURL)
if err != nil {
return 0, fmt.Errorf("get timeline (page %d): %w", page, err)
}
var events []TimelineEvent
if err := json.Unmarshal(tlBody, &events); err != nil {
return 0, fmt.Errorf("parse timeline (page %d): %w", page, err)
}
for _, ev := range events {
if ev.Type == "review" && ev.User.Login == review.User.Login && strings.HasPrefix(ev.Body, matchPrefix) {
return ev.ID, nil
}
}
if len(events) < pageSize {
break
}
}
return 0, fmt.Errorf("no timeline event found for review %d", reviewID)
}
// EditComment updates the body of an issue/review comment.
func (c *Client) EditComment(ctx context.Context, owner, repo string, commentID int64, newBody string) error {
reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/issues/comments/%d",
+232
@@ -29,8 +29,13 @@ func ParseResponse(response string) (*ReviewResult, error) {
var result ReviewResult
if err := json.Unmarshal([]byte(cleaned), &result); err != nil {
// LLMs sometimes produce JSON with unescaped quotes inside string values.
// Try to repair before giving up.
repaired := repairJSON(cleaned)
if err2 := json.Unmarshal([]byte(repaired), &result); err2 != nil {
return nil, fmt.Errorf("parse LLM response as JSON: %w\nRaw response: %s", err, response)
}
}
// Validate verdict
switch result.Verdict {
@@ -74,3 +79,230 @@ func extractJSON(s string) string {
s = strings.TrimSpace(s)
return s
}
// repairJSON attempts to fix common LLM JSON issues:
// - Unescaped double quotes inside string values
//
// Strategy: walk the JSON structurally. Object keys are parsed normally (LLMs
// get those right). For string VALUES, we collect all candidate closing quotes
// and pick the FIRST one that is followed by a valid JSON continuation. Every
// unescaped quote before that point is treated as string content and escaped,
// which targets the "LLM put unescaped quotes in a string value" failure mode.
func repairJSON(s string) string {
runes := []rune(s)
var out strings.Builder
out.Grow(len(s) + 64)
i := 0
for i < len(runes) {
c := runes[i]
if c != '"' {
out.WriteRune(c)
i++
continue
}
// We hit an opening quote. Determine if this is a key or a value.
// Keys: LLMs get keys right, so we parse them normally
// (the first unescaped quote closes the string).
// Values: may contain unescaped quotes — use the repair heuristic.
isValue := isValuePosition(runes, i)
if !isValue {
// Parse key/simple string normally
out.WriteRune('"')
i++
for i < len(runes) {
ch := runes[i]
if ch == '\\' && i+1 < len(runes) {
out.WriteRune(ch)
i++
out.WriteRune(runes[i])
i++
continue
}
if ch == '"' {
out.WriteRune('"')
i++
break
}
out.WriteRune(ch)
i++
}
continue
}
// Value string: find the correct close using the first-valid-candidate heuristic
out.WriteRune('"')
i++
closeIdx := findClosingQuote(runes, i)
// Write everything between open and close, escaping interior quotes
for j := i; j < closeIdx; j++ {
ch := runes[j]
if ch == '\\' && j+1 < closeIdx {
// Already-escaped sequence — pass through
out.WriteRune(ch)
j++
out.WriteRune(runes[j])
} else if ch == '"' {
out.WriteRune('\\')
out.WriteRune('"')
} else {
out.WriteRune(ch)
}
}
// Write the closing quote
out.WriteRune('"')
i = closeIdx + 1
}
return out.String()
}
// isValuePosition determines if the quote at position i is opening a JSON value
// string (as opposed to an object key). We only apply repair to values that
// follow ':' since those are the free-text fields where LLMs produce unescaped
// quotes. Array elements and keys are left alone (parsed normally).
func isValuePosition(runes []rune, i int) bool {
// Look backward, skipping whitespace, for the preceding structural char
j := i - 1
for j >= 0 && (runes[j] == ' ' || runes[j] == '\t' || runes[j] == '\n' || runes[j] == '\r') {
j--
}
if j < 0 {
return false
}
// After ':' → definitely a value
return runes[j] == ':'
}
// findClosingQuote finds the index of the true closing quote for a JSON string
// value starting at position start (the character after the opening quote).
// It collects all unescaped quote candidates and returns the FIRST one that
// produces valid JSON continuation (deeper lookahead verifies the next token).
func findClosingQuote(runes []rune, start int) int {
// Collect all candidate positions for the closing quote.
var candidates []int
for j := start; j < len(runes); j++ {
if runes[j] == '\\' {
j++ // skip escaped character
continue
}
if runes[j] == '"' {
candidates = append(candidates, j)
}
}
if len(candidates) == 0 {
return len(runes)
}
if len(candidates) == 1 {
return candidates[0]
}
// Try candidates from FIRST to LAST. The correct closing quote is the
// earliest one that produces valid JSON structure after it (verified by
// deeper lookahead that checks the next token is a valid JSON start).
for _, idx := range candidates {
if isValidJSONAfterClose(runes, idx+1) {
return idx
}
}
// Fallback: return the last candidate
return candidates[len(candidates)-1]
}
// isValidJSONAfterClose checks whether the runes after a candidate closing quote
// look like valid JSON continuation for a VALUE string. Since we only use this
// for value positions, ':' is NOT a valid continuation (values are never keys).
// Checks deeper structure to avoid being fooled by JSON-like content in strings.
func isValidJSONAfterClose(runes []rune, pos int) bool {
j := pos
for j < len(runes) && (runes[j] == ' ' || runes[j] == '\t' || runes[j] == '\n' || runes[j] == '\r') {
j++
}
if j >= len(runes) {
return true
}
next := runes[j]
if next == '}' || next == ']' {
// Closing a container. Verify what follows the close is also valid:
// another structural char, comma, or EOF.
return isValidAfterContainerClose(runes, j+1)
}
if next == ',' {
// After comma, must be followed by a valid JSON token
j++
for j < len(runes) && (runes[j] == ' ' || runes[j] == '\t' || runes[j] == '\n' || runes[j] == '\r') {
j++
}
if j >= len(runes) {
return false // trailing comma with nothing after — invalid
}
return isJSONTokenStart(runes, j)
}
// ':' is NOT valid here — we're in a value position, not a key.
// Any other character is also invalid.
return false
}
// isValidAfterContainerClose checks that after a } or ], the continuation is
// structurally valid: more closes, comma+token, or EOF.
func isValidAfterContainerClose(runes []rune, pos int) bool {
j := pos
for j < len(runes) && (runes[j] == ' ' || runes[j] == '\t' || runes[j] == '\n' || runes[j] == '\r') {
j++
}
if j >= len(runes) {
return true
}
next := runes[j]
if next == '}' || next == ']' {
return isValidAfterContainerClose(runes, j+1)
}
if next == ',' {
j++
for j < len(runes) && (runes[j] == ' ' || runes[j] == '\t' || runes[j] == '\n' || runes[j] == '\r') {
j++
}
if j >= len(runes) {
return false
}
return isJSONTokenStart(runes, j)
}
return false
}
// isJSONTokenStart returns true if the rune could begin a JSON value or key.
// For keywords (true/false/null), verifies the full keyword is present.
func isJSONTokenStart(runes []rune, pos int) bool {
if pos >= len(runes) {
return false
}
r := runes[pos]
switch {
case r == '"': // string
return true
case r == '{' || r == '[': // object or array
return true
case r == 't': // true
return pos+4 <= len(runes) && string(runes[pos:pos+4]) == "true"
case r == 'f': // false
return pos+5 <= len(runes) && string(runes[pos:pos+5]) == "false"
case r == 'n': // null
return pos+4 <= len(runes) && string(runes[pos:pos+4]) == "null"
case r >= '0' && r <= '9': // number
return true
case r == '-': // negative number
return true
}
return false
}
+110
@@ -1,6 +1,7 @@
package review
import (
"encoding/json"
"testing"
)
@@ -112,3 +113,112 @@ func TestParseResponse_MarkdownFencesNoLang(t *testing.T) {
t.Errorf("expected APPROVE, got %q", result.Verdict)
}
}
func TestParseResponse_UnescapedQuotesInStrings(t *testing.T) {
// Real failure from CI: Sonnet puts unescaped quotes like (e.g. "28") in findings
input := `{"verdict": "APPROVE", "summary": "Clean PR", "findings": [{"severity": "NIT", "file": "ci/Dockerfile", "line": 14, "finding": "The comment says OTP_VERSION is the major version (e.g. \"28\") but it actually contains unescaped quotes like (e.g. "28") which breaks JSON"}], "recommendation": "Ship it"}`
result, err := ParseResponse(input)
if err != nil {
t.Fatalf("expected repair to handle unescaped quotes, got error: %v", err)
}
if result.Verdict != "APPROVE" {
t.Errorf("expected APPROVE, got %q", result.Verdict)
}
if len(result.Findings) != 1 {
t.Fatalf("expected 1 finding, got %d", len(result.Findings))
}
}
func TestRepairJSON_NoOpOnValid(t *testing.T) {
valid := `{"key": "value", "num": 42}`
result := repairJSON(valid)
if result != valid {
t.Errorf("repairJSON should not modify valid JSON\n got: %s\n want: %s", result, valid)
}
}
func TestRepairJSON_FixesUnescapedQuotes(t *testing.T) {
// Interior quote followed by non-structural character
input := `{"msg": "use "foo" here"}`
result := repairJSON(input)
// Should be parseable now
var m map[string]interface{}
if err := json.Unmarshal([]byte(result), &m); err != nil {
t.Fatalf("repaired JSON should parse, got: %v\nrepaired: %s", err, result)
}
}
func TestRepairJSON_InteriorQuoteBeforeComma(t *testing.T) {
// Bug reported by reviewer: interior quoted word immediately before a comma
input := `{"msg": "say "yes", and go"}`
result := repairJSON(input)
var m map[string]interface{}
if err := json.Unmarshal([]byte(result), &m); err != nil {
t.Fatalf("repaired JSON should parse, got: %v\nrepaired: %s", err, result)
}
// The full string content should be preserved
msg, ok := m["msg"].(string)
if !ok {
t.Fatal("msg field missing or not a string")
}
if msg != `say "yes", and go` {
t.Errorf("unexpected msg content: %q", msg)
}
}
func TestRepairJSON_InteriorQuoteBeforeCloseBrace(t *testing.T) {
// Bug reported by reviewer: JSON-shaped syntax inside string values
input := `{"msg": "input map {"key": "val"} caused error"}`
result := repairJSON(input)
var m map[string]interface{}
if err := json.Unmarshal([]byte(result), &m); err != nil {
t.Fatalf("repaired JSON should parse, got: %v\nrepaired: %s", err, result)
}
}
func TestRepairJSON_MultipleFields(t *testing.T) {
// Multiple string fields with unescaped quotes in different positions
input := `{"a": "hello "world"", "b": "foo"}`
result := repairJSON(input)
var m map[string]interface{}
if err := json.Unmarshal([]byte(result), &m); err != nil {
t.Fatalf("repaired JSON should parse, got: %v\nrepaired: %s", err, result)
}
if _, ok := m["b"]; !ok {
t.Error("expected 'b' field to be preserved")
}
}
func TestRepairJSON_PreservesEscapedQuotes(t *testing.T) {
// Already-escaped quotes should not be double-escaped
input := `{"msg": "already \"escaped\" here"}`
result := repairJSON(input)
if result != input {
t.Errorf("repairJSON should not modify already-escaped quotes\n got: %s\n want: %s", result, input)
}
var m map[string]interface{}
if err := json.Unmarshal([]byte(result), &m); err != nil {
t.Fatalf("repaired JSON should parse, got: %v\nrepaired: %s", err, result)
}
}
func TestRepairJSON_ComplexNestedContent(t *testing.T) {
// Combines both reviewer bugs: quoted words before commas AND JSON-like content
input := `{"verdict": "APPROVE", "findings": [{"finding": "The map {"key": "val"} and (e.g. "28") and say "yes", then stop"}]}`
result := repairJSON(input)
var parsed map[string]interface{}
if err := json.Unmarshal([]byte(result), &parsed); err != nil {
t.Fatalf("repaired JSON should parse, got: %v\nrepaired: %s", err, result)
}
if parsed["verdict"] != "APPROVE" {
t.Errorf("expected verdict APPROVE, got %v", parsed["verdict"])
}
}