docs: allow approved third-party packages #59

rodin merged 4 commits from allow-deps into main 2026-05-10 21:07:10 +00:00
Showing only changes of commit aeb0c8cb79
+68 -30
@@ -3,6 +3,7 @@
 # Exit 1 if any unapproved import is found.
 #
 # The allowlist is parsed from CONVENTIONS.md to maintain a single source of truth.
+# Also enforces Scope column: "test only" packages cannot appear in non-test code.
 set -euo pipefail
@@ -13,22 +14,53 @@ if [ ! -f "$CONVENTIONS_FILE" ]; then
     exit 1
 fi
-# Parse approved packages from CONVENTIONS.md table
-# Looks for lines like: | `gopkg.in/yaml.v3` | ...
-ALLOWED=()
-while IFS= read -r line; do
-    # Extract package from markdown table cell: | `package` |
-    pkg=$(echo "$line" | grep -oP '\| `\K[^`]+' | head -1 || true)
-    if [ -n "$pkg" ] && [[ "$pkg" != "Package" ]]; then
-        ALLOWED+=("$pkg")
-    fi
-done < <(grep -E '^\| `[a-zA-Z]' "$CONVENTIONS_FILE" || true)
-if [ ${#ALLOWED[@]} -eq 0 ]; then
+# Parse approved packages from CONVENTIONS.md table using awk (POSIX-compatible)
+# Format: | `package` | use case | scope |
+# Output: package:scope (e.g., "gopkg.in/yaml.v3:production")
+declare -A ALLOWED_PROD=()
+declare -A ALLOWED_TEST=()
+while IFS= read -r line; do
Review

**[MINOR]** Uses 'grep -P' (Perl regex), which is not available on macOS/BSD grep by default. This reduces cross-platform developer usability for the precommit hook.
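
One way to avoid `grep -P` is to do the extraction with POSIX `sed`. The following is a minimal sketch, assuming table rows of the form ``| `package` | use case | scope |``; the helper name `extract_pkg` is hypothetical and does not appear in the script under review.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the reviewed script): print the first
# backtick-wrapped token found in a markdown table row.
# Only POSIX basic regular expressions are used, so no Perl-regex
# support is required from the local grep or sed.
extract_pkg() {
    printf '%s\n' "$1" | sed -n 's/^[^`]*`\([^`]*\)`.*$/\1/p'
}

# Example row (illustrative data only).
extract_pkg '| `gopkg.in/yaml.v3` | YAML parsing | production |'
```

Rows without a backtick-wrapped cell, such as the table header, produce no output because `sed -n` prints only when the substitution succeeds.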
+    # Use awk to extract package and scope from table row
+    # Split on | and extract backtick-wrapped package from column 1, scope from column 3
+    pkg=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]*`|`[[:space:]]*$/, "", $2); print $2}')
+    scope=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]+|[[:space:]]+$/, "", $4); print tolower($4)}')
+    if [ -n "$pkg" ] && [ "$pkg" != "Package" ] && [[ "$pkg" =~ ^[a-zA-Z] ]]; then
+        if [[ "$scope" == *"test"* ]]; then
+            ALLOWED_TEST["$pkg"]=1
+        else
Review

**[NIT]** Parsing the markdown table with `grep`/`awk` is somewhat brittle (e.g., relies on backticks around package names and exact column positions). This is acceptable given the documented process, but a brief note in CONVENTIONS.md to preserve formatting would help avoid accidental breakage.
+            ALLOWED_PROD["$pkg"]=1
+        fi
+    fi
Review

**[MINOR]** The filter `[[ "$pkg" =~ ^[a-zA-Z] ]]` rejects valid import paths that begin with a digit (e.g., 9fans.net/go). Consider relaxing to `^[[:alnum:]]` or removing the check, since the header row is already excluded by the grep.
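
The relaxed check suggested above could look like the sketch below. It assumes the header row is excluded elsewhere (as the script already does by comparing against `Package`); the function name `starts_like_import_path` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical predicate (not part of the reviewed script): accept any
# candidate whose first character is a letter or a digit.
starts_like_import_path() {
    [[ "$1" =~ ^[[:alnum:]] ]]
}

# Illustrative candidates: two plausible import paths, a table
# separator cell, and an empty string.
for candidate in "gopkg.in/yaml.v3" "9fans.net/go" "---" ""; do
    if starts_like_import_path "$candidate"; then
        echo "accept: '$candidate'"
    else
        echo "reject: '$candidate'"
    fi
done
```

With this predicate a path such as `9fans.net/go` is accepted, while separator cells like `---` and empty cells are still rejected.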
+done < <(grep '| `' "$CONVENTIONS_FILE" 2>/dev/null || true)
+ALL_ALLOWED=("${!ALLOWED_PROD[@]}" "${!ALLOWED_TEST[@]}")
+if [ ${#ALL_ALLOWED[@]} -eq 0 ]; then
     echo "⚠️ No approved packages found in $CONVENTIONS_FILE"
     echo " (This is fine if you want stdlib-only)"
 fi
+# Helper: check if import matches any package in an associative array (literal prefix, no glob)
+matches_allowlist() {
+    local import="$1"
+    shift
+    local -n allowlist=$1
+    for allowed in "${!allowlist[@]}"; do
+        # Exact match
+        if [ "$import" = "$allowed" ]; then
+            return 0
+        fi
+        # Literal prefix match for subpackages (no glob interpretation)
+        if [ "${import#"$allowed/"}" != "$import" ]; then
Review

**[NIT]** The comment says 'POSIX-compatible' for the awk parsing but the outer loop uses a Bash process substitution `< <(...)` which is Bash-specific. The comment is mildly misleading — it means awk itself is POSIX, not the overall approach.
+            return 0
+        fi
+    done
+    return 1
+}
 # Get DIRECT dependencies only (exclude indirect/transitive)
 # Fail closed: if go list fails, we exit non-zero
 IMPORTS=$(go list -m -f '{{if not .Indirect}}{{.Path}}{{end}}' all 2>&1) || {
@@ -44,21 +76,13 @@ if [ -z "$IMPORTS" ]; then
     exit 0
 fi
+# Check all direct dependencies are in the allowlist
 VIOLATIONS=""
 while IFS= read -r import; do
     [ -z "$import" ] && continue
-    # Check if import matches any allowed package (prefix match for subpackages)
-    MATCHED=false
-    for allowed in "${ALLOWED[@]}"; do
-        if [[ "$import" == "$allowed" ]] || [[ "$import" == "$allowed/"* ]]; then
-            MATCHED=true
-            break
-        fi
-    done
-    if [ "$MATCHED" = false ]; then
-        VIOLATIONS="${VIOLATIONS} - ${import}"$'\n'
+    if ! matches_allowlist "$import" ALLOWED_PROD && ! matches_allowlist "$import" ALLOWED_TEST; then
+        VIOLATIONS="${VIOLATIONS} - ${import} (not in allowlist)"$'\n'
     fi
 done <<< "$IMPORTS"
@@ -68,17 +92,31 @@ if [ -n "$VIOLATIONS" ]; then
     echo "The following imports are not in the allowlist:"
     printf "%s" "$VIOLATIONS"
     echo ""
-    echo "Approved packages (from CONVENTIONS.md):"
-    for pkg in "${ALLOWED[@]}"; do
-        echo " - $pkg"
-    done
+    echo "To add a dependency, update CONVENTIONS.md (requires Aaron's approval)"
+    exit 1
+fi
+# Enforce Scope: test-only packages must not appear in non-test code
Review

**[MINOR]** The test-scope enforcement uses `go list -deps -f '...' ./... | grep -v '_test'` to identify production imports, but this approach is fragile: it filters by path containing '_test', which catches test packages but the `-deps` flag on non-test packages already excludes test files. More importantly, this will fail to detect a test-only package imported via a non-test file in a sub-package whose path happens not to contain '_test'. Using `go list -f '{{if not .Standard}}{{.ImportPath}}{{end}}' $(go list ./...)` (without `-deps`) and then checking against the allowlist directly would be more precise. That said, the current approach is a reasonable heuristic.
Review

**[NIT]** The `grep -v '_test'` in the production imports step is unnecessary because `go list -deps` without `-test` already excludes test-only dependencies. While harmless, it can be removed for clarity.
+# Get imports used by non-test code only
+PROD_IMPORTS=$(go list -deps -f '{{if not .Standard}}{{.ImportPath}}{{end}}' ./... 2>/dev/null | grep -v '_test' || true)
Review

**[MINOR]** Production imports discovery allows go list to fail open (2>/dev/null | ... || true). If go list fails, PROD_IMPORTS becomes empty and the script won’t detect misuse of test-only dependencies, weakening enforcement. Failing closed here would improve robustness.
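
A fail-closed variant can be sketched with a small wrapper that captures a command's output and propagates its exit status. This is an illustration under stated assumptions, not the script's actual code: the name `collect_or_fail` is hypothetical, and the demo uses `printf` and `false` as stand-ins so it runs without a Go toolchain.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of the reviewed script): run a command,
# print its stdout on success, and return non-zero if the command fails.
# The declaration and assignment of `out` are kept separate so that
# `local` does not mask the command's exit status.
collect_or_fail() {
    local out
    out=$("$@") || {
        echo "ERROR: command failed: $*" >&2
        return 1
    }
    printf '%s\n' "$out"
}

# Stand-in for a successful listing command.
collect_or_fail printf '%s\n' "example.com/a" "example.com/b"

# Stand-in for a failing command: the caller sees the failure
# instead of silently receiving an empty list.
if ! collect_or_fail false 2>/dev/null; then
    echo "listing failed; a real hook would exit 1 here"
fi
```

In the hook this would be called as `PROD_IMPORTS=$(collect_or_fail go list -deps -f '...' ./...) || exit 1`, so a `go list` failure stops the check rather than letting it pass with an empty import list.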
+TEST_ONLY_IN_PROD=""
Review

**[MINOR]** The scope enforcement uses `grep -q "^${test_pkg}"` which can produce false positives for modules with a similar prefix (e.g., github.com/google/go-cmppro would match github.com/google/go-cmp). Use a delimiter-aware pattern like `^${test_pkg}(/|$)` to avoid accidental matches.
+for test_pkg in "${!ALLOWED_TEST[@]}"; do
Review

**[MINOR]** The regex `"^${test_pkg}(/|\$|$)"` has a redundant `$` — `\$` escapes the literal dollar sign for the shell but in the regex context inside `grep -qE`, the trailing `|$)` is a proper end-of-line anchor followed by the closing group, making `\$` (literal `$` character) unreachable in practice. This is harmless but the intent (match exact package or subpackage) is already handled by the `^pkg($|/)` pattern. Consider simplifying to `"^${test_pkg}(/|$)"`.
+    if echo "$PROD_IMPORTS" | grep -q "^${test_pkg}"; then
+        TEST_ONLY_IN_PROD="${TEST_ONLY_IN_PROD} - ${test_pkg} (marked 'test only' but used in production code)"$'\n'
+    fi
+done
Review

**[MINOR]** The test-only production check uses `grep -q "^${test_pkg}"` without anchoring the end of the pattern. A package path like `github.com/google/go-cmp` would inadvertently match a hypothetical `github.com/google/go-cmp-extended`. Should be `grep -qE "^${test_pkg}(/|$)"` to match only exact or subpackage imports.
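
As an alternative to the anchored regex, the match can be done literally, reusing the parameter-expansion idea from `matches_allowlist`. The sketch below is one possible shape; the helper name `uses_package` and the sample import list are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the reviewed script): succeed if any
# line of $1 is exactly $2 or begins with "$2/".
# The comparison is literal, so "." in a module path is not treated as
# a regex wildcard and sibling modules with a longer name do not match.
uses_package() {
    local imports="$1" pkg="$2" line
    while IFS= read -r line; do
        if [ "$line" = "$pkg" ] || [ "${line#"$pkg/"}" != "$line" ]; then
            return 0
        fi
    done <<< "$imports"
    return 1
}

# Illustrative import list containing only a similarly named module.
imports=$'github.com/google/go-cmp-extended/diff\ngithub.com/spf13/cobra'
if uses_package "$imports" "github.com/google/go-cmp"; then
    echo "match"
else
    echo "no match"
fi
```

Here `github.com/google/go-cmp-extended/diff` does not count as a use of `github.com/google/go-cmp`, whereas `github.com/google/go-cmp/cmp` would.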
Review

**[MINOR]** The grep regex `^${test_pkg}(/|\$|$)` redundantly includes both `\$` and `$`. It can be simplified to `^${test_pkg}(/|$)` without changing semantics.
+if [ -n "$TEST_ONLY_IN_PROD" ]; then
+    echo "❌ TEST-ONLY DEPENDENCIES IN PRODUCTION CODE"
     echo ""
-    echo "To add a dependency:"
-    echo " 1. Open a PR that ONLY updates CONVENTIONS.md"
-    echo " 2. Get explicit approval from Aaron"
-    echo " 3. After merge, use the package in a separate PR"
+    printf "%s" "$TEST_ONLY_IN_PROD"
+    echo ""
+    echo "These packages are marked 'test only' in CONVENTIONS.md"
+    echo "and must only be imported from *_test.go files."
     exit 1
 fi
 echo "✅ All dependencies are approved"
 echo " Direct deps: $(echo "$IMPORTS" | wc -l | tr -d ' ')"
+echo " Production: ${#ALLOWED_PROD[@]}, Test-only: ${#ALLOWED_TEST[@]}"