docs: allow approved third-party packages #59

Merged
rodin merged 4 commits from allow-deps into main 2026-05-10 21:07:10 +00:00
Showing only changes of commit aeb0c8cb79 - Show all commits
+68 -30
View File
@@ -3,6 +3,7 @@
# Exit 1 if any unapproved import is found.
#
# The allowlist is parsed from CONVENTIONS.md to maintain a single source of truth.
# Also enforces Scope column: "test only" packages cannot appear in non-test code.
set -euo pipefail
Outdated
Review

[MINOR] The allowlist is duplicated between CONVENTIONS.md and this script (ALLOWED array), creating a risk of drift. A mismatch could either block approved deps or unintentionally allow unapproved ones, weakening policy enforcement. Consider deriving the list from a single source (e.g., parse CONVENTIONS.md or define the list in a single machine-readable file).

**[MINOR]** The allowlist is duplicated between CONVENTIONS.md and this script (`ALLOWED` array), creating a risk of drift. A mismatch could either block approved deps or unintentionally allow unapproved ones, weakening policy enforcement. Consider deriving the list from a single source (e.g., parse CONVENTIONS.md or define the list in a single machine-readable file).
Outdated
Review

[MAJOR] The allowlist is hard-coded in the script (ALLOWED array) while CONVENTIONS.md is presented as the source of truth. The documented process says to open a PR that ONLY updates the table, but the enforcement will still fail later unless this script is updated too. This mismatch will cause policy drift and block subsequent PRs using newly approved dependencies.

**[MAJOR]** The allowlist is hard-coded in the script (ALLOWED array) while CONVENTIONS.md is presented as the source of truth. The documented process says to open a PR that ONLY updates the table, but the enforcement will still fail later unless this script is updated too. This mismatch will cause policy drift and block subsequent PRs using newly approved dependencies.
@@ -13,22 +14,53 @@ if [ ! -f "$CONVENTIONS_FILE" ]; then
exit 1
Outdated
Review

[MINOR] The ALLOWED_PATTERN variable is built but never used. The actual checking is done in the loop with direct string comparison against the ALLOWED array. The dead code is harmless but confusing — either use the pattern (e.g., with grep) or remove it.

**[MINOR]** The ALLOWED_PATTERN variable is built but never used. The actual checking is done in the loop with direct string comparison against the ALLOWED array. The dead code is harmless but confusing — either use the pattern (e.g., with grep) or remove it.
fi
Outdated
Review

[NIT] ALLOWED_PATTERN is built but never used. Remove this dead code to reduce confusion and keep the script minimal.

**[NIT]** ALLOWED_PATTERN is built but never used. Remove this dead code to reduce confusion and keep the script minimal.
# Parse approved packages from CONVENTIONS.md table
# Looks for lines like: | `gopkg.in/yaml.v3` | ...
ALLOWED=()
while IFS= read -r line; do
# Extract package from markdown table cell: | `package` |
pkg=$(echo "$line" | grep -oP '\| `\K[^`]+' | head -1 || true)
if [ -n "$pkg" ] && [[ "$pkg" != "Package" ]]; then
ALLOWED+=("$pkg")
fi
done < <(grep -E '^\| `[a-zA-Z]' "$CONVENTIONS_FILE" || true)
# Parse approved packages from CONVENTIONS.md table using awk (POSIX-compatible)
# Format: | `package` | use case | scope |
# Output: package:scope (e.g., "gopkg.in/yaml.v3:production")
declare -A ALLOWED_PROD=()
Outdated
Review

[MAJOR] The script does not enforce the 'Scope' column from CONVENTIONS.md (e.g., allowing github.com/google/go-cmp only in tests). As written, any listed package is allowed in production code, contradicting the documented policy.

**[MAJOR]** The script does not enforce the 'Scope' column from CONVENTIONS.md (e.g., allowing github.com/google/go-cmp only in tests). As written, any listed package is allowed in production code, contradicting the documented policy.
Outdated
Review

[MINOR] The script relies on bash associative arrays (declare -A), which require Bash 4+. On macOS (default Bash 3.2), this will fail locally for contributors running make precommit. Consider documenting the requirement, invoking a known shell (e.g., via env bash with ensured version), or rewriting to avoid associative arrays.

**[MINOR]** The script relies on bash associative arrays (`declare -A`), which require Bash 4+. On macOS (default Bash 3.2), this will fail locally for contributors running `make precommit`. Consider documenting the requirement, invoking a known shell (e.g., via `env bash` with ensured version), or rewriting to avoid associative arrays.
declare -A ALLOWED_TEST=()
Outdated
Review

[NIT] The comment says 'POSIX-compatible' but declare -A (associative arrays) is a bash 4+ feature and is not POSIX. The comment should say 'bash 4+' to accurately describe the requirement. macOS ships bash 3.2 by default (GPLv2 licensing constraint), so this script would fail on stock macOS without Homebrew bash. Worth documenting the bash version requirement in a shebang comment or README.

**[NIT]** The comment says 'POSIX-compatible' but `declare -A` (associative arrays) is a bash 4+ feature and is not POSIX. The comment should say 'bash 4+' to accurately describe the requirement. macOS ships bash 3.2 by default (GPLv2 licensing constraint), so this script would fail on stock macOS without Homebrew bash. Worth documenting the bash version requirement in a shebang comment or README.
if [ ${#ALLOWED[@]} -eq 0 ]; then
while IFS= read -r line; do
Review

[MINOR] Uses 'grep -P' (Perl regex), which is not available on macOS/BSD grep by default. This reduces cross-platform developer usability for the precommit hook.

**[MINOR]** Uses 'grep -P' (Perl regex), which is not available on macOS/BSD grep by default. This reduces cross-platform developer usability for the precommit hook.
# Use awk to extract package and scope from table row
Outdated
Review

[MINOR] The script fails open if go list -m all errors (|| true), then treats an empty IMPORTS as "no external dependencies" and exits 0. This can bypass enforcement if the Go toolchain is unavailable or the command fails. Prefer failing closed on errors or explicitly erroring when go list fails.

**[MINOR]** The script fails open if `go list -m all` errors (`|| true`), then treats an empty IMPORTS as "no external dependencies" and exits 0. This can bypass enforcement if the Go toolchain is unavailable or the command fails. Prefer failing closed on errors or explicitly erroring when `go list` fails.
# Split on | and extract backtick-wrapped package from column 1, scope from column 3
pkg=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]*`|`[[:space:]]*$/, "", $2); print $2}')
Outdated
Review

[MINOR] Table parsing relies on a brittle grep pattern ('^| `[a-zA-Z]') and a single-code-cell extraction. Future formatting changes (leading spaces, additional code spans) or package names starting with non-letters may cause false negatives/positives.

**[MINOR]** Table parsing relies on a brittle grep pattern ('^\| `[a-zA-Z]') and a single-code-cell extraction. Future formatting changes (leading spaces, additional code spans) or package names starting with non-letters may cause false negatives/positives.
scope=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]+|[[:space:]]+$/, "", $4); print tolower($4)}')
Outdated
Review

[MINOR] go list -m all includes indirect/transitive dependencies. If yaml.v3 or go-cmp pull in their own transitive deps, those would appear here and trigger false-positive violations. Consider using go mod edit -json | jq '.Require[] | select(.Indirect == false) | .Path' or filtering with go list -m -mod=mod -f '{{if not .Indirect}}{{.Path}}{{end}}' all to check only direct dependencies.

**[MINOR]** `go list -m all` includes indirect/transitive dependencies. If yaml.v3 or go-cmp pull in their own transitive deps, those would appear here and trigger false-positive violations. Consider using `go mod edit -json | jq '.Require[] | select(.Indirect == false) | .Path'` or filtering with `go list -m -mod=mod -f '{{if not .Indirect}}{{.Path}}{{end}}' all` to check only direct dependencies.
if [ -n "$pkg" ] && [ "$pkg" != "Package" ] && [[ "$pkg" =~ ^[a-zA-Z] ]]; then
if [[ "$scope" == *"test"* ]]; then
ALLOWED_TEST["$pkg"]=1
else
Review

[NIT] Parsing the markdown table with grep/awk is somewhat brittle (e.g., relies on backticks around package names and exact column positions). This is acceptable given the documented process, but a brief note in CONVENTIONS.md to preserve formatting would help avoid accidental breakage.

**[NIT]** Parsing the markdown table with `grep`/`awk` is somewhat brittle (e.g., relies on backticks around package names and exact column positions). This is acceptable given the documented process, but a brief note in CONVENTIONS.md to preserve formatting would help avoid accidental breakage.
ALLOWED_PROD["$pkg"]=1
fi
Outdated
Review

[MINOR] The -oP flag for PCRE-style regex (\K) is a GNU grep extension and is not available on macOS (BSD grep). This will silently fail to parse the allowlist on macOS developer machines (the || true swallows the error), leaving ALLOWED empty and triggering the warning but not enforcing anything. Consider using sed instead: echo "$line" | sed -n "s/^| \([^`])`./\1/p"`, or document that GNU grep is required.

**[MINOR]** The `-oP` flag for PCRE-style regex (`\K`) is a GNU grep extension and is not available on macOS (BSD grep). This will silently fail to parse the allowlist on macOS developer machines (the `|| true` swallows the error), leaving ALLOWED empty and triggering the warning but not enforcing anything. Consider using `sed` instead: `echo "$line" | sed -n "s/^| \`\([^\`]*\)\`.*/\1/p"`, or document that GNU grep is required.
fi
Review

[MINOR] The filter [[ "$pkg" =~ ^[a-zA-Z] ]] rejects valid import paths that begin with a digit (e.g., 9fans.net/go). Consider relaxing to ^[[:alnum:]] or removing the check, since the header row is already excluded by the grep.

**[MINOR]** The filter `[[ "$pkg" =~ ^[a-zA-Z] ]]` rejects valid import paths that begin with a digit (e.g., 9fans.net/go). Consider relaxing to `^[[:alnum:]]` or removing the check, since the header row is already excluded by the grep.
done < <(grep '| `' "$CONVENTIONS_FILE" 2>/dev/null || true)
ALL_ALLOWED=("${!ALLOWED_PROD[@]}" "${!ALLOWED_TEST[@]}")
if [ ${#ALL_ALLOWED[@]} -eq 0 ]; then
echo "⚠️ No approved packages found in $CONVENTIONS_FILE"
echo " (This is fine if you want stdlib-only)"
fi
Outdated
Review

[MINOR] go list -m -f '{{if not .Indirect}}{{.Path}}{{end}}' all lists module-level direct dependencies, not package-level imports. This means a module in go.mod that isn't actually imported in any Go source file will still be flagged. More precisely, it will miss the case where someone adds a go.mod entry for an unapproved module but the script will catch modules not in go.mod at all (since they wouldn't be listed). This is probably fine for the intended purpose, but the comment 'DIRECT dependencies only' may mislead — it means direct in the go.mod sense, not 'actually used in source'. Consider a brief clarifying comment.

**[MINOR]** `go list -m -f '{{if not .Indirect}}{{.Path}}{{end}}' all` lists module-level direct dependencies, not package-level imports. This means a module in go.mod that isn't actually imported in any Go source file will still be flagged. More precisely, it will miss the case where someone adds a `go.mod` entry for an unapproved module but the script will catch modules not in go.mod at all (since they wouldn't be listed). This is probably fine for the intended purpose, but the comment 'DIRECT dependencies only' may mislead — it means direct in the go.mod sense, not 'actually used in source'. Consider a brief clarifying comment.
# Helper: check if import matches any package in an associative array (literal prefix, no glob)
matches_allowlist() {
local import="$1"
shift
local -n allowlist=$1
for allowed in "${!allowlist[@]}"; do
Outdated
Review

[NIT] echo -e is not POSIX-compliant and behavior varies across shells (e.g., on some systems /bin/bash may not support -e with printf semantics). The \n in the violations string also won't render on all platforms. Consider using printf instead: printf 'The following imports are not in the allowlist:%b\n' "$VIOLATIONS".

**[NIT]** `echo -e` is not POSIX-compliant and behavior varies across shells (e.g., on some systems /bin/bash may not support -e with printf semantics). The `\n` in the violations string also won't render on all platforms. Consider using `printf` instead: `printf 'The following imports are not in the allowlist:%b\n' "$VIOLATIONS"`.
# Exact match
if [ "$import" = "$allowed" ]; then
return 0
Outdated
Review

[MINOR] The prefix match uses , which treats the right side as a glob pattern. While allowed entries are controlled via review, a metacharacter (e.g., *, ?, [) in an approved package could unintentionally broaden matches. Consider a literal prefix check to avoid glob semantics.

**[MINOR]** The prefix match uses [[ "$import" == "$allowed/"* ]], which treats the right side as a glob pattern. While allowed entries are controlled via review, a metacharacter (e.g., *, ?, [) in an approved package could unintentionally broaden matches. Consider a literal prefix check to avoid glob semantics.
fi
# Literal prefix match for subpackages (no glob interpretation)
if [ "${import#"$allowed/"}" != "$import" ]; then
Review

[NIT] The comment says 'POSIX-compatible' for the awk parsing but the outer loop uses a Bash process substitution < <(...) which is Bash-specific. The comment is mildly misleading — it means awk itself is POSIX, not the overall approach.

**[NIT]** The comment says 'POSIX-compatible' for the awk parsing but the outer loop uses a Bash process substitution `< <(...)` which is Bash-specific. The comment is mildly misleading — it means awk itself is POSIX, not the overall approach.
return 0
fi
done
return 1
}
# Get DIRECT dependencies only (exclude indirect/transitive)
# Fail closed: if go list fails, we exit non-zero
Outdated
Review

[MAJOR] The script checks only direct non-test modules via go list -m ... all and never verifies test-only dependencies against the allowlist. As a result, importing any new third-party package in tests (e.g., in *_test.go) will not be flagged unless also used in production code. This violates the documented 'STRICT ALLOWLIST' policy for tests.

**[MAJOR]** The script checks only direct non-test modules via `go list -m ... all` and never verifies test-only dependencies against the allowlist. As a result, importing any new third-party package in tests (e.g., in *_test.go) will not be flagged unless also used in production code. This violates the documented 'STRICT ALLOWLIST' policy for tests.
IMPORTS=$(go list -m -f '{{if not .Indirect}}{{.Path}}{{end}}' all 2>&1) || {
3
@@ -44,21 +76,13 @@ if [ -z "$IMPORTS" ]; then
exit 0
fi
# Check all direct dependencies are in the allowlist
VIOLATIONS=""
while IFS= read -r import; do
[ -z "$import" ] && continue
# Check if import matches any allowed package (prefix match for subpackages)
MATCHED=false
for allowed in "${ALLOWED[@]}"; do
if [[ "$import" == "$allowed" ]] || [[ "$import" == "$allowed/"* ]]; then
MATCHED=true
break
fi
done
if [ "$MATCHED" = false ]; then
VIOLATIONS="${VIOLATIONS} - ${import}"$'\n'
if ! matches_allowlist "$import" ALLOWED_PROD && ! matches_allowlist "$import" ALLOWED_TEST; then
Outdated
Review

[NIT] The final line counts direct deps via wc -l on $IMPORTS, but $IMPORTS could contain blank lines depending on the go list output format when some template expansions produce empty strings. The grep -v '^$' earlier helps but the count may still be off by one if there's a trailing newline. Minor cosmetic issue.

**[NIT]** The final line counts direct deps via `wc -l` on `$IMPORTS`, but `$IMPORTS` could contain blank lines depending on the `go list` output format when some template expansions produce empty strings. The `grep -v '^$'` earlier helps but the count may still be off by one if there's a trailing newline. Minor cosmetic issue.
VIOLATIONS="${VIOLATIONS} - ${import} (not in allowlist)"$'\n'
fi
done <<< "$IMPORTS"
@@ -68,17 +92,31 @@ if [ -n "$VIOLATIONS" ]; then
echo "The following imports are not in the allowlist:"
printf "%s" "$VIOLATIONS"
echo ""
echo "Approved packages (from CONVENTIONS.md):"
for pkg in "${ALLOWED[@]}"; do
echo " - $pkg"
done
echo "To add a dependency, update CONVENTIONS.md (requires Aaron's approval)"
exit 1
fi
# Enforce Scope: test-only packages must not appear in non-test code
Review

[MINOR] The test-scope enforcement uses go list -deps -f '...' ./... | grep -v '_test' to identify production imports, but this approach is fragile: it filters by path containing '_test', which catches test packages but the -deps flag on non-test packages already excludes test files. More importantly, this will fail to detect a test-only package imported via a non-test file in a sub-package whose path happens not to contain '_test'. Using go list -f '{{if not .Standard}}{{.ImportPath}}{{end}}' $(go list ./...) (without -deps) and then checking against the allowlist directly would be more precise. That said, the current approach is a reasonable heuristic.

**[MINOR]** The test-scope enforcement uses `go list -deps -f '...' ./... | grep -v '_test'` to identify production imports, but this approach is fragile: it filters by path containing '_test', which catches test packages but the `-deps` flag on non-test packages already excludes test files. More importantly, this will fail to detect a test-only package imported via a non-test file in a sub-package whose path happens not to contain '_test'. Using `go list -f '{{if not .Standard}}{{.ImportPath}}{{end}}' $(go list ./...)` (without `-deps`) and then checking against the allowlist directly would be more precise. That said, the current approach is a reasonable heuristic.
Review

[NIT] The grep -v '_test' in the production imports step is unnecessary because go list -deps without -test already excludes test-only dependencies. While harmless, it can be removed for clarity.

**[NIT]** The `grep -v '_test'` in the production imports step is unnecessary because `go list -deps` without `-test` already excludes test-only dependencies. While harmless, it can be removed for clarity.
# Get imports used by non-test code only
PROD_IMPORTS=$(go list -deps -f '{{if not .Standard}}{{.ImportPath}}{{end}}' ./... 2>/dev/null | grep -v '_test' || true)
Review

[MINOR] Production imports discovery allows go list to fail open (2>/dev/null | ... || true). If go list fails, PROD_IMPORTS becomes empty and the script won’t detect misuse of test-only dependencies, weakening enforcement. Failing closed here would improve robustness.

**[MINOR]** Production imports discovery allows go list to fail open (2>/dev/null | ... || true). If go list fails, PROD_IMPORTS becomes empty and the script won’t detect misuse of test-only dependencies, weakening enforcement. Failing closed here would improve robustness.
TEST_ONLY_IN_PROD=""
Review

[MINOR] The scope enforcement uses grep -q "^${test_pkg}" which can produce false positives for modules with a similar prefix (e.g., github.com/google/go-cmppro would match github.com/google/go-cmp). Use a delimiter-aware pattern like ^${test_pkg}(/|$) to avoid accidental matches.

**[MINOR]** The scope enforcement uses `grep -q "^${test_pkg}"` which can produce false positives for modules with a similar prefix (e.g., github.com/google/go-cmppro would match github.com/google/go-cmp). Use a delimiter-aware pattern like `^${test_pkg}(/|$)` to avoid accidental matches.
for test_pkg in "${!ALLOWED_TEST[@]}"; do
Review

[MINOR] The regex "^${test_pkg}(/|\$|$)" has a redundant $\$ escapes the literal dollar sign for the shell but in the regex context inside grep -qE, the trailing |$) is a proper end-of-line anchor followed by the closing group, making \$ (literal $ character) unreachable in practice. This is harmless but the intent (match exact package or subpackage) is already handled by the ^pkg($|/) pattern. Consider simplifying to "^${test_pkg}(/|$)".

**[MINOR]** The regex `"^${test_pkg}(/|\$|$)"` has a redundant `$` — `\$` escapes the literal dollar sign for the shell but in the regex context inside `grep -qE`, the trailing `|$)` is a proper end-of-line anchor followed by the closing group, making `\$` (literal `$` character) unreachable in practice. This is harmless but the intent (match exact package or subpackage) is already handled by the `^pkg($|/)` pattern. Consider simplifying to `"^${test_pkg}(/|$)"`.
if echo "$PROD_IMPORTS" | grep -q "^${test_pkg}"; then
Outdated
Review

[MINOR] The grep check for test-only packages in production uses an unescaped regex pattern (grep -q "^${test_pkg}"). If a package name contains regex metacharacters (e.g., dots), this may match unintended imports. While not a code execution risk, it can cause false positives/negatives in enforcement.

**[MINOR]** The grep check for test-only packages in production uses an unescaped regex pattern (grep -q "^${test_pkg}"). If a package name contains regex metacharacters (e.g., dots), this may match unintended imports. While not a code execution risk, it can cause false positives/negatives in enforcement.
TEST_ONLY_IN_PROD="${TEST_ONLY_IN_PROD} - ${test_pkg} (marked 'test only' but used in production code)"$'\n'
fi
done
Review

[MINOR] The test-only production check uses grep -q "^${test_pkg}" without anchoring the end of the pattern. A package path like github.com/google/go-cmp would inadvertently match a hypothetical github.com/google/go-cmp-extended. Should be grep -qE "^${test_pkg}(/|$)" to match only exact or subpackage imports.

**[MINOR]** The test-only production check uses `grep -q "^${test_pkg}"` without anchoring the end of the pattern. A package path like `github.com/google/go-cmp` would inadvertently match a hypothetical `github.com/google/go-cmp-extended`. Should be `grep -qE "^${test_pkg}(/|$)"` to match only exact or subpackage imports.
Review

[MINOR] The grep regex ^${test_pkg}(/|\$|$) redundantly includes both \$ and $. It can be simplified to ^${test_pkg}(/|$) without changing semantics.

**[MINOR]** The grep regex `^${test_pkg}(/|\$|$)` redundantly includes both `\$` and `$`. It can be simplified to `^${test_pkg}(/|$)` without changing semantics.
if [ -n "$TEST_ONLY_IN_PROD" ]; then
echo "❌ TEST-ONLY DEPENDENCIES IN PRODUCTION CODE"
echo ""
echo "To add a dependency:"
echo " 1. Open a PR that ONLY updates CONVENTIONS.md"
echo " 2. Get explicit approval from Aaron"
echo " 3. After merge, use the package in a separate PR"
printf "%s" "$TEST_ONLY_IN_PROD"
echo ""
echo "These packages are marked 'test only' in CONVENTIONS.md"
echo "and must only be imported from *_test.go files."
exit 1
fi
echo "✅ All dependencies are approved"
echo " Direct deps: $(echo "$IMPORTS" | wc -l | tr -d ' ')"
echo " Production: ${#ALLOWED_PROD[@]}, Test-only: ${#ALLOWED_TEST[@]}"