docs: allow approved third-party packages #59
@@ -3,6 +3,7 @@
# Exit 1 if any unapproved import is found.
#
# The allowlist is parsed from CONVENTIONS.md to maintain a single source of truth.
# Also enforces Scope column: "test only" packages cannot appear in non-test code.

set -euo pipefail
@@ -13,22 +14,53 @@ if [ ! -f "$CONVENTIONS_FILE" ]; then
  exit 1
sonnet-review-bot commented
**[MINOR]** The ALLOWED_PATTERN variable is built but never used. The actual checking is done in the loop with direct string comparison against the ALLOWED array. The dead code is harmless but confusing: either use the pattern (e.g., with grep) or remove it.
fi
gpt-review-bot commented
**[NIT]** ALLOWED_PATTERN is built but never used. Remove this dead code to reduce confusion and keep the script minimal.

# Parse approved packages from CONVENTIONS.md table
# Looks for lines like: | `gopkg.in/yaml.v3` | ...
ALLOWED=()
while IFS= read -r line; do
  # Extract package from markdown table cell: | `package` |
  pkg=$(echo "$line" | grep -oP '\| `\K[^`]+' | head -1 || true)
  if [ -n "$pkg" ] && [[ "$pkg" != "Package" ]]; then
    ALLOWED+=("$pkg")
  fi
done < <(grep -E '^\| `[a-zA-Z]' "$CONVENTIONS_FILE" || true)
# Parse approved packages from CONVENTIONS.md table using awk (POSIX-compatible)
# Format: | `package` | use case | scope |
# Output: package:scope (e.g., "gopkg.in/yaml.v3:production")
declare -A ALLOWED_PROD=()
gpt-review-bot commented
**[MAJOR]** The script does not enforce the 'Scope' column from CONVENTIONS.md (e.g., allowing github.com/google/go-cmp only in tests). As written, any listed package is allowed in production code, contradicting the documented policy.
gpt-review-bot commented
**[MINOR]** The script relies on bash associative arrays (`declare -A`), which require Bash 4+. On macOS (default Bash 3.2), this will fail locally for contributors running `make precommit`. Consider documenting the requirement, invoking a known shell (e.g., via `env bash` with ensured version), or rewriting to avoid associative arrays.
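One possible way to avoid associative arrays, sketched under the assumption that the allowlist is small enough to keep in plain strings. The names `ALLOWED_PROD_LIST`, `ALLOWED_TEST_LIST`, `add_allowed`, and `is_allowed` are hypothetical and do not appear in the PR; the two sample packages are the ones mentioned in this thread.

```shell
#!/usr/bin/env bash
# Sketch: keep allowlist entries in newline-delimited strings instead of
# `declare -A`, so the same logic can also run on Bash 3.2.
# All names below are hypothetical, not part of the script under review.

ALLOWED_PROD_LIST=""
ALLOWED_TEST_LIST=""

# add_allowed SCOPE PKG: append PKG to the list that matches SCOPE.
add_allowed() {
    local scope="$1" pkg="$2"
    case "$scope" in
        *test*) ALLOWED_TEST_LIST="${ALLOWED_TEST_LIST}${pkg}"$'\n' ;;
        *)      ALLOWED_PROD_LIST="${ALLOWED_PROD_LIST}${pkg}"$'\n' ;;
    esac
}

# is_allowed LIST IMPORT: succeed if IMPORT equals an entry in LIST or is a
# subpackage of one. Quoted parts of a case pattern are matched literally.
is_allowed() {
    local list="$1" import="$2" allowed
    while IFS= read -r allowed; do
        [ -z "$allowed" ] && continue
        case "$import" in
            "$allowed"|"$allowed"/*) return 0 ;;
        esac
    done <<EOF
$list
EOF
    return 1
}

add_allowed "production" "gopkg.in/yaml.v3"
add_allowed "test only" "github.com/google/go-cmp"
```

With this shape, a check such as `is_allowed "$ALLOWED_TEST_LIST" "$import"` would stand in for a lookup in `ALLOWED_TEST`, at the cost of a linear scan per import.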
declare -A ALLOWED_TEST=()

sonnet-review-bot commented
**[NIT]** The comment says 'POSIX-compatible' but `declare -A` (associative arrays) is a bash 4+ feature and is not POSIX. The comment should say 'bash 4+' to accurately describe the requirement. macOS ships bash 3.2 by default (GPLv2 licensing constraint), so this script would fail on stock macOS without Homebrew bash. Worth documenting the bash version requirement in a shebang comment or README.
if [ ${#ALLOWED[@]} -eq 0 ]; then
while IFS= read -r line; do
gpt-review-bot commented
**[MINOR]** Uses 'grep -P' (Perl regex), which is not available on macOS/BSD grep by default. This reduces cross-platform developer usability for the precommit hook.
  # Use awk to extract package and scope from table row
**[MINOR]** The script fails open if `go list -m all` errors (`|| true`), then treats an empty IMPORTS as "no external dependencies" and exits 0. This can bypass enforcement if the Go toolchain is unavailable or the command fails. Prefer failing closed on errors or explicitly erroring when `go list` fails.
  # Split on | and extract backtick-wrapped package from column 1, scope from column 3
  pkg=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]*`|`[[:space:]]*$/, "", $2); print $2}')
gpt-review-bot commented
**[MINOR]** Table parsing relies on a brittle grep pattern ('^\| `[a-zA-Z]') and a single-code-cell extraction. Future formatting changes (leading spaces, additional code spans) or package names starting with non-letters may cause false negatives/positives.
  scope=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]+|[[:space:]]+$/, "", $4); print tolower($4)}')
sonnet-review-bot commented
**[MINOR]** `go list -m all` includes indirect/transitive dependencies. If yaml.v3 or go-cmp pull in their own transitive deps, those would appear here and trigger false-positive violations. Consider using `go mod edit -json | jq '.Require[] | select(.Indirect == false) | .Path'` or filtering with `go list -m -mod=mod -f '{{if not .Indirect}}{{.Path}}{{end}}' all` to check only direct dependencies.

  if [ -n "$pkg" ] && [ "$pkg" != "Package" ] && [[ "$pkg" =~ ^[a-zA-Z] ]]; then
    if [[ "$scope" == *"test"* ]]; then
      ALLOWED_TEST["$pkg"]=1
    else
gpt-review-bot commented
**[NIT]** Parsing the markdown table with `grep`/`awk` is somewhat brittle (e.g., relies on backticks around package names and exact column positions). This is acceptable given the documented process, but a brief note in CONVENTIONS.md to preserve formatting would help avoid accidental breakage.
      ALLOWED_PROD["$pkg"]=1
    fi
sonnet-review-bot commented
**[MINOR]** The `-oP` flag for PCRE-style regex (`\K`) is a GNU grep extension and is not available on macOS (BSD grep). This will silently fail to parse the allowlist on macOS developer machines (the `|| true` swallows the error), leaving ALLOWED empty and triggering the warning but not enforcing anything. Consider using `sed` instead: `echo "$line" | sed -n "s/^| \`\([^\`]*\)\`.*/\1/p"`, or document that GNU grep is required.
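As an alternative to both `grep -P` and the `sed` one-liner, the table could be parsed in a single awk pass. This is only a sketch: `parse_allowlist` is a hypothetical helper name, and it assumes the three-column layout described in the script's own comment (package in backticks, use case, scope).

```shell
#!/usr/bin/env bash
# Sketch: extract "package<TAB>scope" pairs from a markdown table without
# relying on GNU grep's -P. The helper name parse_allowlist is hypothetical.

# parse_allowlist FILE: print one "package<TAB>scope" line for each table row
# whose first cell is a single backtick-wrapped package name. Header and
# separator rows have no backticks in the first cell, so they are skipped.
parse_allowlist() {
    awk -F'|' '
        {
            pkg = $2; scope = $4
            gsub(/^[ \t]+|[ \t]+$/, "", pkg)     # trim the cell
            gsub(/^[ \t]+|[ \t]+$/, "", scope)
            if (pkg ~ /^`[^`]+`$/) {
                gsub(/`/, "", pkg)               # drop the backticks
                print pkg "\t" tolower(scope)
            }
        }
    ' "$1"
}
```

A caller could then read the pairs with `while IFS=$'\t' read -r pkg scope; do ...; done < <(parse_allowlist "$CONVENTIONS_FILE")`, which keeps the extraction logic in one place.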
  fi
gpt-review-bot commented
**[MINOR]** The filter `[[ "$pkg" =~ ^[a-zA-Z] ]]` rejects valid import paths that begin with a digit (e.g., 9fans.net/go). Consider relaxing to `^[[:alnum:]]` or removing the check, since the header row is already excluded by the grep.
done < <(grep '| `' "$CONVENTIONS_FILE" 2>/dev/null || true)

ALL_ALLOWED=("${!ALLOWED_PROD[@]}" "${!ALLOWED_TEST[@]}")

if [ ${#ALL_ALLOWED[@]} -eq 0 ]; then
  echo "⚠️ No approved packages found in $CONVENTIONS_FILE"
  echo " (This is fine if you want stdlib-only)"
fi

sonnet-review-bot commented
**[MINOR]** `go list -m -f '{{if not .Indirect}}{{.Path}}{{end}}' all` lists module-level direct dependencies, not package-level imports. This means a module in go.mod that isn't actually imported in any Go source file will still be flagged. More precisely, it will miss the case where someone adds a `go.mod` entry for an unapproved module but the script will catch modules not in go.mod at all (since they wouldn't be listed). This is probably fine for the intended purpose, but the comment 'DIRECT dependencies only' may mislead: it means direct in the go.mod sense, not 'actually used in source'. Consider a brief clarifying comment.
# Helper: check if import matches any package in an associative array (literal prefix, no glob)
matches_allowlist() {
  local import="$1"
  shift
  local -n allowlist=$1

  for allowed in "${!allowlist[@]}"; do
sonnet-review-bot commented
**[NIT]** `echo -e` is not POSIX-compliant and behavior varies across shells (e.g., on some systems /bin/bash may not support -e with printf semantics). The `\n` in the violations string also won't render on all platforms. Consider using `printf` instead: `printf 'The following imports are not in the allowlist:%b\n' "$VIOLATIONS"`.
    # Exact match
    if [ "$import" = "$allowed" ]; then
      return 0
**[MINOR]** The prefix match uses [[ "$import" == "$allowed/"* ]], which treats the right side as a glob pattern. While allowed entries are controlled via review, a metacharacter (e.g., *, ?, [) in an approved package could unintentionally broaden matches. Consider a literal prefix check to avoid glob semantics.
    fi
    # Literal prefix match for subpackages (no glob interpretation)
    if [ "${import#"$allowed/"}" != "$import" ]; then
sonnet-review-bot commented
**[NIT]** The comment says 'POSIX-compatible' for the awk parsing but the outer loop uses a Bash process substitution `< <(...)` which is Bash-specific. The comment is mildly misleading: it means awk itself is POSIX, not the overall approach.
      return 0
    fi
  done
  return 1
}

# Get DIRECT dependencies only (exclude indirect/transitive)
# Fail closed: if go list fails, we exit non-zero
gpt-review-bot commented
**[MAJOR]** The script checks only direct non-test modules via `go list -m ... all` and never verifies test-only dependencies against the allowlist. As a result, importing any new third-party package in tests (e.g., in *_test.go) will not be flagged unless also used in production code. This violates the documented 'STRICT ALLOWLIST' policy for tests.
IMPORTS=$(go list -m -f '{{if not .Indirect}}{{.Path}}{{end}}' all 2>&1) || {
@@ -44,21 +76,13 @@ if [ -z "$IMPORTS" ]; then
  exit 0
fi

# Check all direct dependencies are in the allowlist
VIOLATIONS=""
while IFS= read -r import; do
  [ -z "$import" ] && continue

  # Check if import matches any allowed package (prefix match for subpackages)
  MATCHED=false
  for allowed in "${ALLOWED[@]}"; do
    if [[ "$import" == "$allowed" ]] || [[ "$import" == "$allowed/"* ]]; then
      MATCHED=true
      break
    fi
  done

  if [ "$MATCHED" = false ]; then
    VIOLATIONS="${VIOLATIONS} - ${import}"$'\n'
  if ! matches_allowlist "$import" ALLOWED_PROD && ! matches_allowlist "$import" ALLOWED_TEST; then
sonnet-review-bot commented
**[NIT]** The final line counts direct deps via `wc -l` on `$IMPORTS`, but `$IMPORTS` could contain blank lines depending on the `go list` output format when some template expansions produce empty strings. The `grep -v '^$'` earlier helps but the count may still be off by one if there's a trailing newline. Minor cosmetic issue.
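If the count matters, one option is to count only non-empty lines rather than piping to `wc -l`. A minimal sketch, with `count_nonempty` as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Sketch: count non-empty lines in a string, so blank template expansions and
# trailing newlines do not inflate the total. count_nonempty is hypothetical.

# count_nonempty TEXT: print the number of lines in TEXT that contain at
# least one character. grep -c exits 1 when the count is 0, hence `|| true`.
count_nonempty() {
    printf '%s\n' "$1" | grep -c . || true
}
```

The summary line could then read `echo " Direct deps: $(count_nonempty "$IMPORTS")"`.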
    VIOLATIONS="${VIOLATIONS} - ${import} (not in allowlist)"$'\n'
  fi
done <<< "$IMPORTS"

@@ -68,17 +92,31 @@ if [ -n "$VIOLATIONS" ]; then
  echo "The following imports are not in the allowlist:"
  printf "%s" "$VIOLATIONS"
  echo ""
  echo "Approved packages (from CONVENTIONS.md):"
  for pkg in "${ALLOWED[@]}"; do
    echo " - $pkg"
  done
  echo "To add a dependency, update CONVENTIONS.md (requires Aaron's approval)"
  exit 1
fi

# Enforce Scope: test-only packages must not appear in non-test code
sonnet-review-bot commented
**[MINOR]** The test-scope enforcement uses `go list -deps -f '...' ./... | grep -v '_test'` to identify production imports, but this approach is fragile: it filters by path containing '_test', which catches test packages but the `-deps` flag on non-test packages already excludes test files. More importantly, this will fail to detect a test-only package imported via a non-test file in a sub-package whose path happens not to contain '_test'. Using `go list -f '{{if not .Standard}}{{.ImportPath}}{{end}}' $(go list ./...)` (without `-deps`) and then checking against the allowlist directly would be more precise. That said, the current approach is a reasonable heuristic.
gpt-review-bot commented
**[NIT]** The `grep -v '_test'` in the production imports step is unnecessary because `go list -deps` without `-test` already excludes test-only dependencies. While harmless, it can be removed for clarity.
# Get imports used by non-test code only
PROD_IMPORTS=$(go list -deps -f '{{if not .Standard}}{{.ImportPath}}{{end}}' ./... 2>/dev/null | grep -v '_test' || true)
**[MINOR]** Production imports discovery allows go list to fail open (2>/dev/null | ... || true). If go list fails, PROD_IMPORTS becomes empty and the script won't detect misuse of test-only dependencies, weakening enforcement. Failing closed here would improve robustness.
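A fail-closed capture could look roughly like the sketch below. `run_or_fail` is a hypothetical wrapper, not something in the PR; the idea is simply to separate "the command failed" from "the command printed nothing".

```shell
#!/usr/bin/env bash
# Sketch: capture a command's output but fail closed when the command itself
# fails, instead of masking the error with `|| true`.
# run_or_fail is a hypothetical helper name.

# run_or_fail LABEL CMD [ARGS...]: print CMD's stdout. If CMD exits non-zero,
# report it on stderr and return 1 so the caller can abort.
run_or_fail() {
    local label="$1" output
    shift
    if ! output=$("$@"); then
        echo "ERROR: $label failed; refusing to continue" >&2
        return 1
    fi
    printf '%s\n' "$output"
}

# Intended use (requires the Go toolchain, so it is left commented out here):
# PROD_IMPORTS=$(run_or_fail "go list" \
#     go list -deps -f '{{if not .Standard}}{{.ImportPath}}{{end}}' ./...) || exit 1
```

Any filtering of the result would then happen in a separate step, so that a filter matching nothing is not mistaken for a `go list` failure.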

TEST_ONLY_IN_PROD=""
gpt-review-bot commented
**[MINOR]** The scope enforcement uses `grep -q "^${test_pkg}"` which can produce false positives for modules with a similar prefix (e.g., github.com/google/go-cmppro would match github.com/google/go-cmp). Use a delimiter-aware pattern like `^${test_pkg}(/|$)` to avoid accidental matches.
for test_pkg in "${!ALLOWED_TEST[@]}"; do
sonnet-review-bot commented
**[MINOR]** The regex `"^${test_pkg}(/|\$|$)"` has a redundant `$`: `\$` escapes the literal dollar sign for the shell but in the regex context inside `grep -qE`, the trailing `|$)` is a proper end-of-line anchor followed by the closing group, making `\$` (literal `$` character) unreachable in practice. This is harmless but the intent (match exact package or subpackage) is already handled by the `^pkg($|/)` pattern. Consider simplifying to `"^${test_pkg}(/|$)"`.
  if echo "$PROD_IMPORTS" | grep -q "^${test_pkg}"; then
**[MINOR]** The grep check for test-only packages in production uses an unescaped regex pattern (grep -q "^${test_pkg}"). If a package name contains regex metacharacters (e.g., dots), this may match unintended imports. While not a code execution risk, it can cause false positives/negatives in enforcement.
    TEST_ONLY_IN_PROD="${TEST_ONLY_IN_PROD} - ${test_pkg} (marked 'test only' but used in production code)"$'\n'
  fi
done
sonnet-review-bot commented
**[MINOR]** The test-only production check uses `grep -q "^${test_pkg}"` without anchoring the end of the pattern. A package path like `github.com/google/go-cmp` would inadvertently match a hypothetical `github.com/google/go-cmp-extended`. Should be `grep -qE "^${test_pkg}(/|$)"` to match only exact or subpackage imports.
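Both the missing end anchor and the unescaped metacharacters could be sidestepped by comparing strings literally instead of building a regex. A sketch using awk's `index()`, with `imports_contain` as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Sketch: delimiter-aware, literal matching of a package against a list of
# import paths, with no regex involved. imports_contain is hypothetical.

# imports_contain PKG: read import paths on stdin, one per line. Succeed if
# any line is exactly PKG or begins with "PKG/". Dots and other regex
# metacharacters in PKG are compared as plain characters.
imports_contain() {
    awk -v pkg="$1" '
        $0 == pkg || index($0, pkg "/") == 1 { found = 1 }
        END { exit found ? 0 : 1 }
    '
}
```

In the loop above, the condition would become `if printf '%s\n' "$PROD_IMPORTS" | imports_contain "$test_pkg"; then`. Note that `awk -v` interprets backslash escapes in the value, which should not matter for Go import paths.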
gpt-review-bot commented
**[MINOR]** The grep regex `^${test_pkg}(/|\$|$)` redundantly includes both `\$` and `$`. It can be simplified to `^${test_pkg}(/|$)` without changing semantics.
if [ -n "$TEST_ONLY_IN_PROD" ]; then
  echo "❌ TEST-ONLY DEPENDENCIES IN PRODUCTION CODE"
  echo ""
  echo "To add a dependency:"
  echo " 1. Open a PR that ONLY updates CONVENTIONS.md"
  echo " 2. Get explicit approval from Aaron"
  echo " 3. After merge, use the package in a separate PR"
  printf "%s" "$TEST_ONLY_IN_PROD"
  echo ""
  echo "These packages are marked 'test only' in CONVENTIONS.md"
  echo "and must only be imported from *_test.go files."
  exit 1
fi

echo "✅ All dependencies are approved"
echo " Direct deps: $(echo "$IMPORTS" | wc -l | tr -d ' ')"
echo " Production: ${#ALLOWED_PROD[@]}, Test-only: ${#ALLOWED_TEST[@]}"

**[MINOR]** The allowlist is duplicated between CONVENTIONS.md and this script (`ALLOWED` array), creating a risk of drift. A mismatch could either block approved deps or unintentionally allow unapproved ones, weakening policy enforcement. Consider deriving the list from a single source (e.g., parse CONVENTIONS.md or define the list in a single machine-readable file).
**[MAJOR]** The allowlist is hard-coded in the script (`ALLOWED` array) while CONVENTIONS.md is presented as the source of truth. The documented process says to open a PR that ONLY updates the table, but the enforcement will still fail later unless this script is updated too. This mismatch will cause policy drift and block subsequent PRs using newly approved dependencies.