docs: allow approved third-party packages #59

Merged
rodin merged 4 commits from allow-deps into main 2026-05-10 21:07:10 +00:00
Showing only changes of commit 01cde16d47
+25 -20
@@ -1,12 +1,21 @@
-#!/bin/bash
+#!/usr/bin/env bash
 # check-deps.sh - Enforces the strict dependency allowlist from CONVENTIONS.md
 # Exit 1 if any unapproved import is found.
+#
+# Requires: Bash 4+ (for associative arrays), Go toolchain
 #
 # The allowlist is parsed from CONVENTIONS.md to maintain a single source of truth.
-# Also enforces Scope column: "test only" packages cannot appear in non-test code.
+# Enforces Scope column: "test only" packages cannot appear in non-test code.
 set -euo pipefail
+# Check bash version
+if ((BASH_VERSINFO[0] < 4)); then
+    echo "❌ Bash 4+ required (found ${BASH_VERSION})"
+    echo " On macOS: brew install bash"
+    exit 1
+fi
 CONVENTIONS_FILE="${1:-CONVENTIONS.md}"
 if [ ! -f "$CONVENTIONS_FILE" ]; then
@@ -16,13 +25,11 @@ fi
 # Parse approved packages from CONVENTIONS.md table using awk (POSIX-compatible)
 # Format: | `package` | use case | scope |
-# Output: package:scope (e.g., "gopkg.in/yaml.v3:production")
 declare -A ALLOWED_PROD=()
 declare -A ALLOWED_TEST=()
 while IFS= read -r line; do
     # Use awk to extract package and scope from table row
**[NIT]** Parsing the markdown table with `grep`/`awk` is somewhat brittle (e.g., relies on backticks around package names and exact column positions). This is acceptable given the documented process, but a brief note in CONVENTIONS.md to preserve formatting would help avoid accidental breakage.
-    # Split on | and extract backtick-wrapped package from column 1, scope from column 3
     pkg=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]*`|`[[:space:]]*$/, "", $2); print $2}')
     scope=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]+|[[:space:]]+$/, "", $4); print tolower($4)}')
**[MINOR]** The filter `[[ "$pkg" =~ ^[a-zA-Z] ]]` rejects valid import paths that begin with a digit (e.g., 9fans.net/go). Consider relaxing to `^[[:alnum:]]` or removing the check, since the header row is already excluded by the grep.
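
A minimal sketch of the relaxed filter suggested above, assuming rows follow the documented `| package | use case | scope |` layout with the package name in backticks. The helper name `parse_row` and the sample rows are hypothetical; the two awk expressions are copied from the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints "package:scope" for a table row, or nothing
# if the first column does not look like an import path.
parse_row() {
    local line=$1 pkg scope
    pkg=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]*`|`[[:space:]]*$/, "", $2); print $2}')
    scope=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]+|[[:space:]]+$/, "", $4); print tolower($4)}')
    # Relaxed check: accept a leading letter or digit (e.g. 9fans.net/go).
    if [[ "$pkg" =~ ^[[:alnum:]] ]]; then
        printf '%s:%s\n' "$pkg" "$scope"
    fi
}

parse_row '| `gopkg.in/yaml.v3` | YAML parsing | production |'   # gopkg.in/yaml.v3:production
parse_row '| `9fans.net/go` | example row | Test only |'          # 9fans.net/go:test only
parse_row '|---|---|---|'                                         # prints nothing
```

The separator row is still rejected because `---` does not start with an alphanumeric character, so the relaxed pattern keeps that safety net while admitting digit-leading paths.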
@@ -53,7 +60,7 @@ matches_allowlist() {
         if [ "$import" = "$allowed" ]; then
             return 0
         fi
-        # Literal prefix match for subpackages (no glob interpretation)
+        # Literal prefix match for subpackages: must match "pkg/" exactly
         if [ "${import#"$allowed/"}" != "$import" ]; then
             return 0
         fi
@@ -61,22 +68,19 @@ matches_allowlist() {
     return 1
 }
-# Get DIRECT dependencies only (exclude indirect/transitive)
+# Get direct module dependencies from go.mod
**[MINOR]** If `go list` returns non-module output (e.g. build errors) the error is captured in `DIRECT_IMPORTS` and the early-exit error message will contain the raw go toolchain output. This works adequately but the error message could be confusing. Minor quality-of-life issue.
-# Fail closed: if go list fails, we exit non-zero
-IMPORTS=$(go list -m -f '{{if not .Indirect}}{{.Path}}{{end}}' all 2>&1) || {
+DIRECT_IMPORTS=$(go list -m -f '{{if and (not .Indirect) (not .Main)}}{{.Path}}{{end}}' all 2>&1) || {
+    echo "❌ Failed to list dependencies: $DIRECT_IMPORTS"
**[MINOR]** The script checks `go.mod` direct dependencies for allowlist compliance, but the scope enforcement (lines 97-111) checks the full production import graph via `go list -deps`. These two checks operate at different granularities (modules versus packages) and could diverge. For example, a direct dependency that is imported only from test files passes the allowlist check but never appears in the output of `go list -deps ./...`, which omits test-only imports unless `-test` is given. Writing `go list -deps -test=false ./...` would convey the intent more explicitly, though the current flag-less form already excludes test builds.
-    echo "❌ Failed to list dependencies: $IMPORTS"
     exit 1
 }
-# Filter out the module itself (first line) and empty lines
-IMPORTS=$(echo "$IMPORTS" | tail -n +2 | grep -v '^$' || true)
-if [ -z "$IMPORTS" ]; then
+DIRECT_IMPORTS=$(echo "$DIRECT_IMPORTS" | grep -v '^$' || true)
+if [ -z "$DIRECT_IMPORTS" ]; then
     echo "✅ No external dependencies"
     exit 0
 fi
-# Check all direct dependencies are in the allowlist
+# Check ALL direct dependencies are in some allowlist
 VIOLATIONS=""
 while IFS= read -r import; do
     [ -z "$import" ] && continue
@@ -84,7 +88,7 @@ while IFS= read -r import; do
     if ! matches_allowlist "$import" ALLOWED_PROD && ! matches_allowlist "$import" ALLOWED_TEST; then
         VIOLATIONS="${VIOLATIONS} - ${import} (not in allowlist)"$'\n'
     fi
-done <<< "$IMPORTS"
+done <<< "$DIRECT_IMPORTS"
 if [ -n "$VIOLATIONS" ]; then
     echo "❌ UNAPPROVED DEPENDENCIES DETECTED"
@@ -97,12 +101,13 @@ if [ -n "$VIOLATIONS" ]; then
 fi
**[MINOR]** Production imports discovery allows go list to fail open (2>/dev/null | ... || true). If go list fails, PROD_IMPORTS becomes empty and the script won’t detect misuse of test-only dependencies, weakening enforcement. Failing closed here would improve robustness.
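
One way the fail-closed behaviour described above might look, as a sketch only. The helper name `list_or_fail` is hypothetical, and the demonstration uses stand-in commands (`printf`, `false`) so it runs without a Go module.

```shell
#!/usr/bin/env bash
# Hypothetical helper: runs a listing command, prints its stdout, and
# returns non-zero (with a message on stderr) if the command fails.
list_or_fail() {
    local out
    if ! out=$("$@" 2>/dev/null); then
        echo "❌ Failed to list production imports: $*" >&2
        return 1
    fi
    printf '%s\n' "$out"
}

# In the script this might be called as (not run here, needs a Go module):
#   PROD_IMPORTS=$(list_or_fail go list -deps -f '{{if not .Standard}}{{.ImportPath}}{{end}}' ./...) || exit 1

# Demonstration with stand-in commands:
list_or_fail printf 'example.com/a\nexample.com/b\n'
list_or_fail false || echo "listing failed, the script would exit 1 here"
```

Declaring `out` with `local` on its own line matters: combining `local out=$(...)` would report the exit status of `local` rather than of the command substitution.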
 # Enforce Scope: test-only packages must not appear in non-test code
**[MINOR]** The scope enforcement uses `grep -q "^${test_pkg}"` which can produce false positives for modules with a similar prefix (e.g., github.com/google/go-cmppro would match github.com/google/go-cmp). Use a delimiter-aware pattern like `^${test_pkg}(/|$)` to avoid accidental matches.
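
A sketch of a delimiter-aware check that avoids regular expressions altogether, reusing the literal-prefix idiom from `matches_allowlist`. The helper name `uses_package` and the sample import paths are hypothetical. A side benefit of this approach, compared with interpolating the path into a `grep` pattern, is that the dots in an import path are compared literally instead of acting as regex wildcards.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if any line of $2 equals $1 or starts with "$1/".
uses_package() {
    local pkg=$1 imports=$2 imp
    while IFS= read -r imp; do
        [ "$imp" = "$pkg" ] && return 0
        # Literal prefix test, same idiom as matches_allowlist
        [ "${imp#"$pkg/"}" != "$imp" ] && return 0
    done <<< "$imports"
    return 1
}

# Similar-looking prefixes must not count as a match.
imports=$'github.com/google/go-cmppro\ngithub.com/google/go-cmp-extended/diff'
uses_package github.com/google/go-cmp "$imports" || echo "no match"

# A real subpackage does count.
imports+=$'\ngithub.com/google/go-cmp/cmp'
uses_package github.com/google/go-cmp "$imports" && echo "match"
```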
-# Get imports used by non-test code only
+# Get imports used by non-test code only (go list -deps without -test excludes test deps)
**[MINOR]** The regex `"^${test_pkg}(/|\$|$)"` contains a redundant alternative. Inside double quotes the shell reduces `\$` to `$`, so `grep -qE` receives `^pkg(/|$|$)`, in which both `$` alternatives are the same end-of-line anchor. This is harmless, but the intent (match the exact package or a subpackage) is already covered by `^pkg($|/)`. Consider simplifying to `"^${test_pkg}(/|$)"`.
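
The redundancy can be seen by printing what the shell actually hands to `grep`. This is an illustrative sketch; the value assigned to `test_pkg` is a sample, and the variable names `current` and `simplified` are hypothetical.

```shell
#!/usr/bin/env bash
test_pkg="github.com/google/go-cmp"   # sample value for illustration

# Pattern as written in the diff: the shell turns \$ into $ before grep sees it.
current="^${test_pkg}(/|\$|$)"
# Simplified pattern proposed in the review comment.
simplified="^${test_pkg}(/|$)"

printf '%s\n' "$current"      # ^github.com/google/go-cmp(/|$|$)
printf '%s\n' "$simplified"   # ^github.com/google/go-cmp(/|$)

# Both patterns accept a subpackage and reject a look-alike prefix.
for re in "$current" "$simplified"; do
    echo "github.com/google/go-cmp/cmp" | grep -qE "$re" && echo "subpackage matches"
    echo "github.com/google/go-cmppro" | grep -qE "$re" || echo "look-alike rejected"
done
```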
-PROD_IMPORTS=$(go list -deps -f '{{if not .Standard}}{{.ImportPath}}{{end}}' ./... 2>/dev/null | grep -v '_test' || true)
+PROD_IMPORTS=$(go list -deps -f '{{if not .Standard}}{{.ImportPath}}{{end}}' ./... 2>/dev/null || true)
 TEST_ONLY_IN_PROD=""
 for test_pkg in "${!ALLOWED_TEST[@]}"; do
**[MINOR]** The test-only production check uses `grep -q "^${test_pkg}"` without anchoring the end of the pattern. A package path like `github.com/google/go-cmp` would inadvertently match a hypothetical `github.com/google/go-cmp-extended`. Should be `grep -qE "^${test_pkg}(/|$)"` to match only exact or subpackage imports.
-    if echo "$PROD_IMPORTS" | grep -q "^${test_pkg}"; then
+    # Use word-boundary matching: exact match or followed by /
**[MINOR]** The grep regex `^${test_pkg}(/|\$|$)` redundantly includes both `\$` and `$`. It can be simplified to `^${test_pkg}(/|$)` without changing semantics.
+    if echo "$PROD_IMPORTS" | grep -qE "^${test_pkg}(/|\$|$)"; then
         TEST_ONLY_IN_PROD="${TEST_ONLY_IN_PROD} - ${test_pkg} (marked 'test only' but used in production code)"$'\n'
     fi
 done
@@ -118,5 +123,5 @@ if [ -n "$TEST_ONLY_IN_PROD" ]; then
 fi
 echo "✅ All dependencies are approved"
-echo " Direct deps: $(echo "$IMPORTS" | wc -l | tr -d ' ')"
-echo " Production: ${#ALLOWED_PROD[@]}, Test-only: ${#ALLOWED_TEST[@]}"
+echo " Direct module deps: $(echo "$DIRECT_IMPORTS" | wc -l | tr -d ' ')"
+echo " Production allowlist: ${#ALLOWED_PROD[@]}, Test-only allowlist: ${#ALLOWED_TEST[@]}"