docs: allow approved third-party packages #59

Merged
rodin merged 4 commits from allow-deps into main 2026-05-10 21:07:10 +00:00
Showing only changes of commit 01cde16d47 - Show all commits
+25 -20
View File
@@ -1,12 +1,21 @@
#!/bin/bash
#!/usr/bin/env bash
# check-deps.sh - Enforces the strict dependency allowlist from CONVENTIONS.md
# Exit 1 if any unapproved import is found.
#
# Requires: Bash 4+ (for associative arrays), Go toolchain
#
# The allowlist is parsed from CONVENTIONS.md to maintain a single source of truth.
# Also enforces Scope column: "test only" packages cannot appear in non-test code.
# Enforces Scope column: "test only" packages cannot appear in non-test code.
Outdated
Review

[MINOR] The allowlist is duplicated between CONVENTIONS.md and this script (ALLOWED array), creating a risk of drift. A mismatch could either block approved deps or unintentionally allow unapproved ones, weakening policy enforcement. Consider deriving the list from a single source (e.g., parse CONVENTIONS.md or define the list in a single machine-readable file).

**[MINOR]** The allowlist is duplicated between CONVENTIONS.md and this script (`ALLOWED` array), creating a risk of drift. A mismatch could either block approved deps or unintentionally allow unapproved ones, weakening policy enforcement. Consider deriving the list from a single source (e.g., parse CONVENTIONS.md or define the list in a single machine-readable file).
Outdated
Review

[MAJOR] The allowlist is hard-coded in the script (ALLOWED array) while CONVENTIONS.md is presented as the source of truth. The documented process says to open a PR that ONLY updates the table, but the enforcement will still fail later unless this script is updated too. This mismatch will cause policy drift and block subsequent PRs using newly approved dependencies.

**[MAJOR]** The allowlist is hard-coded in the script (ALLOWED array) while CONVENTIONS.md is presented as the source of truth. The documented process says to open a PR that ONLY updates the table, but the enforcement will still fail later unless this script is updated too. This mismatch will cause policy drift and block subsequent PRs using newly approved dependencies.
set -euo pipefail
# Check bash version
if ((BASH_VERSINFO[0] < 4)); then
echo "❌ Bash 4+ required (found ${BASH_VERSION})"
Outdated
Review

[MINOR] The ALLOWED_PATTERN variable is built but never used. The actual checking is done in the loop with direct string comparison against the ALLOWED array. The dead code is harmless but confusing — either use the pattern (e.g., with grep) or remove it.

**[MINOR]** The ALLOWED_PATTERN variable is built but never used. The actual checking is done in the loop with direct string comparison against the ALLOWED array. The dead code is harmless but confusing — either use the pattern (e.g., with grep) or remove it.
echo " On macOS: brew install bash"
Outdated
Review

[NIT] ALLOWED_PATTERN is built but never used. Remove this dead code to reduce confusion and keep the script minimal.

**[NIT]** ALLOWED_PATTERN is built but never used. Remove this dead code to reduce confusion and keep the script minimal.
exit 1
fi
CONVENTIONS_FILE="${1:-CONVENTIONS.md}"
Outdated
Review

[MAJOR] The script does not enforce the 'Scope' column from CONVENTIONS.md (e.g., allowing github.com/google/go-cmp only in tests). As written, any listed package is allowed in production code, contradicting the documented policy.

**[MAJOR]** The script does not enforce the 'Scope' column from CONVENTIONS.md (e.g., allowing github.com/google/go-cmp only in tests). As written, any listed package is allowed in production code, contradicting the documented policy.
Outdated
Review

[MINOR] The script relies on bash associative arrays (declare -A), which require Bash 4+. On macOS (default Bash 3.2), this will fail locally for contributors running make precommit. Consider documenting the requirement, invoking a known shell (e.g., via env bash with ensured version), or rewriting to avoid associative arrays.

**[MINOR]** The script relies on bash associative arrays (`declare -A`), which require Bash 4+. On macOS (default Bash 3.2), this will fail locally for contributors running `make precommit`. Consider documenting the requirement, invoking a known shell (e.g., via `env bash` with ensured version), or rewriting to avoid associative arrays.
if [ ! -f "$CONVENTIONS_FILE" ]; then
3
@@ -16,13 +25,11 @@ fi
# Parse approved packages from CONVENTIONS.md table using awk (POSIX-compatible)
Outdated
Review

[MINOR] Table parsing relies on a brittle grep pattern ('^| `[a-zA-Z]') and a single-code-cell extraction. Future formatting changes (leading spaces, additional code spans) or package names starting with non-letters may cause false negatives/positives.

**[MINOR]** Table parsing relies on a brittle grep pattern ('^\| `[a-zA-Z]') and a single-code-cell extraction. Future formatting changes (leading spaces, additional code spans) or package names starting with non-letters may cause false negatives/positives.
# Format: | `package` | use case | scope |
Outdated
Review

[MINOR] go list -m all includes indirect/transitive dependencies. If yaml.v3 or go-cmp pull in their own transitive deps, those would appear here and trigger false-positive violations. Consider using go mod edit -json | jq '.Require[] | select(.Indirect == false) | .Path' or filtering with go list -m -mod=mod -f '{{if not .Indirect}}{{.Path}}{{end}}' all to check only direct dependencies.

**[MINOR]** `go list -m all` includes indirect/transitive dependencies. If yaml.v3 or go-cmp pull in their own transitive deps, those would appear here and trigger false-positive violations. Consider using `go mod edit -json | jq '.Require[] | select(.Indirect == false) | .Path'` or filtering with `go list -m -mod=mod -f '{{if not .Indirect}}{{.Path}}{{end}}' all` to check only direct dependencies.
# Output: package:scope (e.g., "gopkg.in/yaml.v3:production")
declare -A ALLOWED_PROD=()
declare -A ALLOWED_TEST=()
while IFS= read -r line; do
# Use awk to extract package and scope from table row
Review

[NIT] Parsing the markdown table with grep/awk is somewhat brittle (e.g., relies on backticks around package names and exact column positions). This is acceptable given the documented process, but a brief note in CONVENTIONS.md to preserve formatting would help avoid accidental breakage.

**[NIT]** Parsing the markdown table with `grep`/`awk` is somewhat brittle (e.g., relies on backticks around package names and exact column positions). This is acceptable given the documented process, but a brief note in CONVENTIONS.md to preserve formatting would help avoid accidental breakage.
# Split on | and extract backtick-wrapped package from column 1, scope from column 3
pkg=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]*`|`[[:space:]]*$/, "", $2); print $2}')
scope=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]+|[[:space:]]+$/, "", $4); print tolower($4)}')
Outdated
Review

[MINOR] The -oP flag for PCRE-style regex (\K) is a GNU grep extension and is not available on macOS (BSD grep). This will silently fail to parse the allowlist on macOS developer machines (the || true swallows the error), leaving ALLOWED empty and triggering the warning but not enforcing anything. Consider using sed instead: echo "$line" | sed -n "s/^| \([^`])`./\1/p"`, or document that GNU grep is required.

**[MINOR]** The `-oP` flag for PCRE-style regex (`\K`) is a GNU grep extension and is not available on macOS (BSD grep). This will silently fail to parse the allowlist on macOS developer machines (the `|| true` swallows the error), leaving ALLOWED empty and triggering the warning but not enforcing anything. Consider using `sed` instead: `echo "$line" | sed -n "s/^| \`\([^\`]*\)\`.*/\1/p"`, or document that GNU grep is required.
Review

[MINOR] The filter [[ "$pkg" =~ ^[a-zA-Z] ]] rejects valid import paths that begin with a digit (e.g., 9fans.net/go). Consider relaxing to ^[[:alnum:]] or removing the check, since the header row is already excluded by the grep.

**[MINOR]** The filter `[[ "$pkg" =~ ^[a-zA-Z] ]]` rejects valid import paths that begin with a digit (e.g., 9fans.net/go). Consider relaxing to `^[[:alnum:]]` or removing the check, since the header row is already excluded by the grep.
4
@@ -53,7 +60,7 @@ matches_allowlist() {
if [ "$import" = "$allowed" ]; then
return 0
fi
# Literal prefix match for subpackages (no glob interpretation)
# Literal prefix match for subpackages: must match "pkg/" exactly
if [ "${import#"$allowed/"}" != "$import" ]; then
return 0
Outdated
Review

[MAJOR] The script checks only direct non-test modules via go list -m ... all and never verifies test-only dependencies against the allowlist. As a result, importing any new third-party package in tests (e.g., in *_test.go) will not be flagged unless also used in production code. This violates the documented 'STRICT ALLOWLIST' policy for tests.

**[MAJOR]** The script checks only direct non-test modules via `go list -m ... all` and never verifies test-only dependencies against the allowlist. As a result, importing any new third-party package in tests (e.g., in *_test.go) will not be flagged unless also used in production code. This violates the documented 'STRICT ALLOWLIST' policy for tests.
fi
@@ -61,22 +68,19 @@ matches_allowlist() {
return 1
Outdated
Review

[MINOR] The go list -m output includes the current module itself on the first line, and the comment says 'Filter out the module itself (first line)'. However, tail -n +2 strips the first line unconditionally. If go list -m ... all ever reorders output or includes a blank first line (e.g., when run in a workspace), this assumption breaks silently. A more robust approach would be to filter by comparing against $(go list -m) (the current module path) rather than relying on line position.

**[MINOR]** The `go list -m` output includes the current module itself on the first line, and the comment says 'Filter out the module itself (first line)'. However, `tail -n +2` strips the first line unconditionally. If `go list -m ... all` ever reorders output or includes a blank first line (e.g., when run in a workspace), this assumption breaks silently. A more robust approach would be to filter by comparing against `$(go list -m)` (the current module path) rather than relying on line position.
}
# Get DIRECT dependencies only (exclude indirect/transitive)
# Fail closed: if go list fails, we exit non-zero
IMPORTS=$(go list -m -f '{{if not .Indirect}}{{.Path}}{{end}}' all 2>&1) || {
echo "❌ Failed to list dependencies: $IMPORTS"
# Get direct module dependencies from go.mod
Review

[MINOR] If go list returns non-module output (e.g. build errors) the error is captured in DIRECT_IMPORTS and the early-exit error message will contain the raw go toolchain output. This works adequately but the error message could be confusing. Minor quality-of-life issue.

**[MINOR]** If `go list` returns non-module output (e.g. build errors) the error is captured in `DIRECT_IMPORTS` and the early-exit error message will contain the raw go toolchain output. This works adequately but the error message could be confusing. Minor quality-of-life issue.
DIRECT_IMPORTS=$(go list -m -f '{{if and (not .Indirect) (not .Main)}}{{.Path}}{{end}}' all 2>&1) || {
echo "❌ Failed to list dependencies: $DIRECT_IMPORTS"
Review

[MINOR] The script checks go.mod direct dependencies for allowlist compliance, but the scope enforcement (lines 97-111) checks the full production import graph via go list -deps. These two checks are at different granularities and could diverge. For example, a direct dependency might be test-only in practice but go list -deps ./... (which doesn't filter test files) would still traverse it. Using go list -deps -test=false ./... more explicitly conveys intent, though the current flag-less form already excludes test builds.

**[MINOR]** The script checks `go.mod` direct dependencies for allowlist compliance, but the scope enforcement (lines 97-111) checks the full production import graph via `go list -deps`. These two checks are at different granularities and could diverge. For example, a direct dependency might be test-only in practice but `go list -deps ./...` (which doesn't filter test files) would still traverse it. Using `go list -deps -test=false ./...` more explicitly conveys intent, though the current flag-less form already excludes test builds.
exit 1
}
DIRECT_IMPORTS=$(echo "$DIRECT_IMPORTS" | grep -v '^$' || true)
# Filter out the module itself (first line) and empty lines
IMPORTS=$(echo "$IMPORTS" | tail -n +2 | grep -v '^$' || true)
if [ -z "$IMPORTS" ]; then
if [ -z "$DIRECT_IMPORTS" ]; then
echo "✅ No external dependencies"
exit 0
fi
# Check all direct dependencies are in the allowlist
# Check ALL direct dependencies are in some allowlist
VIOLATIONS=""
Outdated
Review

[NIT] The final line counts direct deps via wc -l on $IMPORTS, but $IMPORTS could contain blank lines depending on the go list output format when some template expansions produce empty strings. The grep -v '^$' earlier helps but the count may still be off by one if there's a trailing newline. Minor cosmetic issue.

**[NIT]** The final line counts direct deps via `wc -l` on `$IMPORTS`, but `$IMPORTS` could contain blank lines depending on the `go list` output format when some template expansions produce empty strings. The `grep -v '^$'` earlier helps but the count may still be off by one if there's a trailing newline. Minor cosmetic issue.
while IFS= read -r import; do
[ -z "$import" ] && continue
@@ -84,7 +88,7 @@ while IFS= read -r import; do
if ! matches_allowlist "$import" ALLOWED_PROD && ! matches_allowlist "$import" ALLOWED_TEST; then
VIOLATIONS="${VIOLATIONS} - ${import} (not in allowlist)"$'\n'
fi
done <<< "$IMPORTS"
done <<< "$DIRECT_IMPORTS"
if [ -n "$VIOLATIONS" ]; then
echo "❌ UNAPPROVED DEPENDENCIES DETECTED"
2
@@ -97,12 +101,13 @@ if [ -n "$VIOLATIONS" ]; then
fi
Review

[MINOR] Production imports discovery allows go list to fail open (2>/dev/null | ... || true). If go list fails, PROD_IMPORTS becomes empty and the script won’t detect misuse of test-only dependencies, weakening enforcement. Failing closed here would improve robustness.

**[MINOR]** Production imports discovery allows go list to fail open (2>/dev/null | ... || true). If go list fails, PROD_IMPORTS becomes empty and the script won’t detect misuse of test-only dependencies, weakening enforcement. Failing closed here would improve robustness.
# Enforce Scope: test-only packages must not appear in non-test code
Review

[MINOR] The scope enforcement uses grep -q "^${test_pkg}" which can produce false positives for modules with a similar prefix (e.g., github.com/google/go-cmppro would match github.com/google/go-cmp). Use a delimiter-aware pattern like ^${test_pkg}(/|$) to avoid accidental matches.

**[MINOR]** The scope enforcement uses `grep -q "^${test_pkg}"` which can produce false positives for modules with a similar prefix (e.g., github.com/google/go-cmppro would match github.com/google/go-cmp). Use a delimiter-aware pattern like `^${test_pkg}(/|$)` to avoid accidental matches.
# Get imports used by non-test code only
PROD_IMPORTS=$(go list -deps -f '{{if not .Standard}}{{.ImportPath}}{{end}}' ./... 2>/dev/null | grep -v '_test' || true)
# Get imports used by non-test code only (go list -deps without -test excludes test deps)
Review

[MINOR] The regex "^${test_pkg}(/|\$|$)" has a redundant $\$ escapes the literal dollar sign for the shell but in the regex context inside grep -qE, the trailing |$) is a proper end-of-line anchor followed by the closing group, making \$ (literal $ character) unreachable in practice. This is harmless but the intent (match exact package or subpackage) is already handled by the ^pkg($|/) pattern. Consider simplifying to "^${test_pkg}(/|$)".

**[MINOR]** The regex `"^${test_pkg}(/|\$|$)"` has a redundant `$` — `\$` escapes the literal dollar sign for the shell but in the regex context inside `grep -qE`, the trailing `|$)` is a proper end-of-line anchor followed by the closing group, making `\$` (literal `$` character) unreachable in practice. This is harmless but the intent (match exact package or subpackage) is already handled by the `^pkg($|/)` pattern. Consider simplifying to `"^${test_pkg}(/|$)"`.
PROD_IMPORTS=$(go list -deps -f '{{if not .Standard}}{{.ImportPath}}{{end}}' ./... 2>/dev/null || true)
Outdated
Review

[MINOR] The grep check for test-only packages in production uses an unescaped regex pattern (grep -q "^${test_pkg}"). If a package name contains regex metacharacters (e.g., dots), this may match unintended imports. While not a code execution risk, it can cause false positives/negatives in enforcement.

**[MINOR]** The grep check for test-only packages in production uses an unescaped regex pattern (grep -q "^${test_pkg}"). If a package name contains regex metacharacters (e.g., dots), this may match unintended imports. While not a code execution risk, it can cause false positives/negatives in enforcement.
TEST_ONLY_IN_PROD=""
for test_pkg in "${!ALLOWED_TEST[@]}"; do
Review

[MINOR] The test-only production check uses grep -q "^${test_pkg}" without anchoring the end of the pattern. A package path like github.com/google/go-cmp would inadvertently match a hypothetical github.com/google/go-cmp-extended. Should be grep -qE "^${test_pkg}(/|$)" to match only exact or subpackage imports.

**[MINOR]** The test-only production check uses `grep -q "^${test_pkg}"` without anchoring the end of the pattern. A package path like `github.com/google/go-cmp` would inadvertently match a hypothetical `github.com/google/go-cmp-extended`. Should be `grep -qE "^${test_pkg}(/|$)"` to match only exact or subpackage imports.
if echo "$PROD_IMPORTS" | grep -q "^${test_pkg}"; then
# Use word-boundary matching: exact match or followed by /
Review

[MINOR] The grep regex ^${test_pkg}(/|\$|$) redundantly includes both \$ and $. It can be simplified to ^${test_pkg}(/|$) without changing semantics.

**[MINOR]** The grep regex `^${test_pkg}(/|\$|$)` redundantly includes both `\$` and `$`. It can be simplified to `^${test_pkg}(/|$)` without changing semantics.
if echo "$PROD_IMPORTS" | grep -qE "^${test_pkg}(/|\$|$)"; then
TEST_ONLY_IN_PROD="${TEST_ONLY_IN_PROD} - ${test_pkg} (marked 'test only' but used in production code)"$'\n'
fi
done
@@ -118,5 +123,5 @@ if [ -n "$TEST_ONLY_IN_PROD" ]; then
fi
echo "✅ All dependencies are approved"
echo " Direct deps: $(echo "$IMPORTS" | wc -l | tr -d ' ')"
echo " Production: ${#ALLOWED_PROD[@]}, Test-only: ${#ALLOWED_TEST[@]}"
echo " Direct module deps: $(echo "$DIRECT_IMPORTS" | wc -l | tr -d ' ')"
echo " Production allowlist: ${#ALLOWED_PROD[@]}, Test-only allowlist: ${#ALLOWED_TEST[@]}"